# 📓 Fine-Tuning BioBERT for Healthcare Claim Credibility Classification

This notebook demonstrates the process of fine-tuning BioBERT, a transformer model pretrained on biomedical text, for the task of classifying healthcare claims as credible or not credible. The final model will output a softmax score interpreted as a credibility percentage and be used in the MediSense Streamlit app.

## 🔧 1. Import Dependencies

This section loads all the essential libraries needed for data loading, preprocessing, model training, and evaluation.

In [1]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [2]:
import os
os.chdir('/content/drive/My Drive/Transformers-Final-proj')


In [3]:
%cd /content/drive/My Drive/Transformers-Final-proj

/content


In [4]:
!pip install "accelerate>=0.26.0"
!pip uninstall transformers datasets -y
!pip install transformers datasets --upgrade
!pip install tf-keras


In [4]:
from transformers import TrainingArguments
print('Transformers OK!')

Transformers OK!


In [5]:
!python -c "from transformers import TrainingArguments; print('Transformers OK!')"

Transformers OK!


In [5]:
import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer
from transformers import DataCollatorWithPadding
from datasets import Dataset, DatasetDict
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


## 📂 2. Load and Preview the Dataset

We load the claims.csv file from the Monant Medical Misinformation dataset, which contains healthcare claims and their corresponding credibility ratings.

In [6]:
# Load dataset
claims_df = pd.read_csv("claims.csv")

## 🧼 3. Clean and Prepare the Dataset

The original `rating` values (`"true"`, `"mostly-true"`, `"false"`, `"mostly-false"`, `"mixture"`, `"unknown"`) are mapped to the numeric labels `1`/`0`, as required for binary classification.

In [7]:
# Label mapping
def map_label(rating):
    if rating in ["true", "mostly-true"]:
        return 1
    elif rating in ["false", "mostly-false", "mixture", "unknown"]:
        return 0
    else:
        return None

The function `map_label` is responsible for mapping the original labels to binary classes:
- **Credible claims (1)**: If the claim's rating is either `"true"` or `"mostly-true"`, it is considered **credible**, and we assign it a value of `1`.
- **Non-credible claims (0)**: If the rating is `"false"`, `"mostly-false"`, `"mixture"`, or `"unknown"`, it is considered **not credible**, and we assign it a value of `0`.

This conversion simplifies the classification by reducing it to two categories: credible (1) and non-credible (0).
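
As a quick check of the mapping described above, the sketch below applies `map_label` to a few sample ratings. The rating `"satire"` is a hypothetical value used only to illustrate the fall-through branch; it is not taken from the dataset.

```python
def map_label(rating):
    """Map a fact-check rating to a binary credibility label."""
    if rating in ["true", "mostly-true"]:
        return 1
    elif rating in ["false", "mostly-false", "mixture", "unknown"]:
        return 0
    else:
        return None  # unrecognized ratings get no label and are dropped later


# "satire" is a hypothetical rating, included only to show the None branch
sample_ratings = ["true", "mostly-false", "mixture", "satire"]
mapped = {rating: map_label(rating) for rating in sample_ratings}
print(mapped)
```

Rows whose rating maps to `None` end up with a missing `label` and are removed by the `dropna` call in the next section.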

## ✂️ 4. Split the Dataset into Train and Test Sets

We drop any rows with missing labels or statements, split the data 80/20 with a fixed random seed, and convert the splits into Hugging Face `DatasetDict` format for use with the Trainer API. Note that the `train_test_split` call in this cell does not pass `stratify`, so the label distribution is only approximately preserved across the two splits.

In [8]:
claims_df["label"] = claims_df["rating"].map(map_label)
claims_df = claims_df.dropna(subset=["label", "statement"])

# Train-test split
train_df, test_df = train_test_split(claims_df[["statement", "label"]], test_size=0.2, random_state=42)

# Ensure labels are integers
train_df["label"] = train_df["label"].astype(int)
test_df["label"] = test_df["label"].astype(int)

# Convert to Huggingface Dataset
dataset = DatasetDict({
    "train": Dataset.from_pandas(train_df),
    "test": Dataset.from_pandas(test_df)
})

In this step, we:
1. **Assign binary labels**: The `map_label` function is used to convert the ratings into binary labels (0 or 1) and assigns them to a new column called `label`.
2. **Remove missing values**: We remove any rows where the `label` or `statement` column is missing, as these would not be useful for training.

Train-Test Split:

- **Training and Testing Split**: `train_test_split` splits the dataset 80/20 into **train** and **test** sets, so the model trains on 80% of the data and is evaluated on the remaining 20%.
- `random_state=42` fixes the shuffle so that the split is reproducible.

Data Formatting:

- **Convert labels to integers**: The `label` column is explicitly converted to integers to ensure consistency for model training.
  
Convert to Hugging Face Dataset:

- The data is converted into a format that can be used by Hugging Face's Trainer API. We use `Dataset.from_pandas()` to convert the pandas dataframes (`train_df`, `test_df`) into Hugging Face `Dataset` objects and store them in a `DatasetDict`.
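
If preserving the label distribution in both splits is important, `train_test_split` accepts a `stratify` argument. The following is a minimal sketch on hypothetical toy data (the statements and the 10% positive rate are made up for illustration), not the split used to produce the results in this notebook.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 statements, 10% labeled credible (1)
statements = [f"claim {i}" for i in range(100)]
labels = [1] * 10 + [0] * 90

# stratify=labels keeps the class ratio the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    statements, labels, test_size=0.2, random_state=42, stratify=labels
)

train_pos_rate = sum(y_train) / len(y_train)
test_pos_rate = sum(y_test) / len(y_test)
print(f"train positives: {train_pos_rate:.2f}, test positives: {test_pos_rate:.2f}")
```

For the notebook's DataFrame, the analogous change would presumably be passing `stratify=claims_df["label"]`; this is an assumption about the intended behavior and would change the recorded metrics.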

## 🧠 5. Load Pretrained BioBERT for Sequence Classification

We load the pre-trained BioBERT tokenizer and model, then tokenize the healthcare statements. BioBERT is pre-trained on biomedical literature, making it well-suited for this task.

In [9]:
# Load tokenizer
model_checkpoint = "dmis-lab/biobert-base-cased-v1.1"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

# Tokenization
def tokenize(example):
    return tokenizer(example["statement"], truncation=True)

tokenized_datasets = dataset.map(tokenize, batched=True)
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

# Load model
model = AutoModelForSequenceClassification.from_pretrained(model_checkpoint, num_labels=2)



Map:   0%|          | 0/1895 [00:00<?, ? examples/s]

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


Map:   0%|          | 0/474 [00:00<?, ? examples/s]

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at dmis-lab/biobert-base-cased-v1.1 and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Next, we load the tokenizer and model and tokenize the dataset.

1. **Loading the Tokenizer**: 
   - The tokenizer is loaded using the `AutoTokenizer.from_pretrained()` method with the model checkpoint `"dmis-lab/biobert-base-cased-v1.1"`. This tokenizer is used to process text data in the same way as the BioBERT model.

2. **Tokenization**: 
   - The `tokenize` function applies the tokenizer to the `"statement"` column with `truncation=True`. As the warning in the cell output indicates, this checkpoint's tokenizer defines no maximum length, so no truncation is actually applied unless `max_length` is also passed.

3. **Applying Tokenization**: 
   - The `map()` function is used to apply the `tokenize` function to the entire dataset (`train` and `test` splits), processing the dataset in batches with `batched=True`.

4. **Data Collator**: 
   - A `DataCollatorWithPadding` is created to handle padding during training. It pads each batch dynamically to the length of its longest sequence, so all sequences within a batch have the same length.

5. **Loading the Model**: 
   - The model is loaded with `AutoModelForSequenceClassification.from_pretrained()`, using the same checkpoint as the tokenizer. The model is configured for **binary classification** (`num_labels=2`), which is appropriate for the task of classifying claims as credible or not credible.
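
To make the idea of dynamic padding concrete, here is a minimal pure-Python sketch. It illustrates the concept only and is not the implementation of `DataCollatorWithPadding`; the helper `pad_batch`, the token ids, and the padding id `0` are all assumptions made for the example.

```python
def pad_batch(batch_input_ids, pad_token_id=0):
    """Pad every sequence in a batch to the length of the longest one."""
    max_len = max(len(ids) for ids in batch_input_ids)
    padded, masks = [], []
    for ids in batch_input_ids:
        n_pad = max_len - len(ids)
        padded.append(ids + [pad_token_id] * n_pad)
        # 1 marks real tokens, 0 marks padding the model should ignore
        masks.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": padded, "attention_mask": masks}


# Hypothetical token ids for two statements of different lengths
batch = pad_batch([[101, 7592, 102], [101, 7592, 2088, 999, 102]])
print(batch["input_ids"])
print(batch["attention_mask"])
```

Because padding is applied per batch rather than to a global maximum, batches of short statements stay short, which saves computation.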


## 📉 6. Define Evaluation Metrics

We define accuracy, precision, recall, and F1 score to evaluate how well the model distinguishes between credible and non-credible claims.

In [10]:
# Metrics
def compute_metrics(pred):
    labels = pred.label_ids
    preds = np.argmax(pred.predictions, axis=1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds),
        "precision": precision_score(labels, preds),
        "recall": recall_score(labels, preds),
    }

This function computes evaluation metrics for the model's predictions.

1. **Input**: 
   - The function receives `pred`, which contains the model's predictions and the true labels.

2. **Extracting Labels and Predictions**: 
   - `labels = pred.label_ids` extracts the true labels from the predictions object.
   - `preds = np.argmax(pred.predictions, axis=1)` extracts the predicted labels by selecting, for each example, the class with the highest logit.

3. **Metrics Calculation**:
   - The function computes the following metrics (precision, recall, and F1 are computed for the positive class, label `1`, which is scikit-learn's default for binary targets):
     - **Accuracy**: Measures the proportion of correct predictions (`accuracy_score`).
     - **F1 Score**: The harmonic mean of precision and recall (`f1_score`).
     - **Precision**: Measures the proportion of predicted positives that are actually positive (`precision_score`).
     - **Recall**: Measures the proportion of actual positives correctly identified (`recall_score`).

4. **Output**: 
   - The function returns a dictionary containing the computed metrics: accuracy, F1 score, precision, and recall.
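
The sketch below recomputes these metrics by hand on a small, hypothetical imbalanced batch to show why accuracy alone can look good while recall and F1 for the credible class stay low. The helper `binary_metrics` and the logits are illustrative assumptions, not part of the notebook.

```python
def binary_metrics(labels, logits):
    """Compute accuracy, precision, recall and F1 for the positive class (1)."""
    # argmax over each row of logits gives the predicted class
    preds = [max(range(len(row)), key=row.__getitem__) for row in logits]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": correct / len(labels),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical batch: 8 non-credible claims and 2 credible ones
labels = [0] * 8 + [1] * 2
logits = [[2.0, -1.0]] * 8 + [[1.5, 0.5], [-0.3, 0.9]]
metrics = binary_metrics(labels, logits)
print(metrics)
```

Here one of the two credible claims is missed, so recall is 0.5 even though 9 of 10 predictions are correct.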


## ⚙️ 7. Set Up Training Arguments

We configure training parameters, such as batch size, number of epochs, evaluation strategy, and model saving behavior.

In [11]:
# Training args
training_args = TrainingArguments(
    gradient_accumulation_steps=4,
    output_dir="./biobert_misinformation",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    logging_steps=10,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
    save_total_limit=1,
    fp16=True,
    load_best_model_at_end=True,
    metric_for_best_model="f1"
)




These training arguments configure the training process.

1. **Gradient Accumulation**: 
   - `gradient_accumulation_steps=4`: Accumulates gradients over 4 batches before each weight update, giving an effective batch size of 32 per device (8 × 4) without the memory cost of a larger batch.

2. **Output Directory**: 
   - `output_dir="./biobert_misinformation"`: Specifies the directory where the model checkpoints and training logs will be saved.

3. **Evaluation and Saving Strategies**:
   - `evaluation_strategy="epoch"`: The model will be evaluated at the end of each epoch.
   - `save_strategy="epoch"`: The model will be saved at the end of each epoch.

4. **Logging**: 
   - `logging_steps=10`: Training metrics are logged every 10 steps.

5. **Batch Size**: 
   - `per_device_train_batch_size=8`: Specifies the batch size for training on each device (GPU/CPU).
   - `per_device_eval_batch_size=8`: Specifies the batch size for evaluation on each device.

6. **Epochs and Weight Decay**:
   - `num_train_epochs=3`: The model will be trained for 3 epochs.
   - `weight_decay=0.01`: Applies weight decay to prevent overfitting.

7. **Model Saving**:
   - `save_total_limit=1`: Limits the number of checkpoints kept on disk. Combined with `load_best_model_at_end=True`, the best checkpoint is retained in addition to the most recent one.

8. **Mixed Precision Training**:
   - `fp16=True`: Enables mixed precision training, which reduces memory usage and speeds up training on GPUs with support for 16-bit floats.

9. **Best Model Selection**:
   - `load_best_model_at_end=True`: The model will automatically load the best checkpoint after training ends.
   - `metric_for_best_model="f1"`: The model will be evaluated based on the F1 score to determine the best model.
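
The arithmetic behind these settings can be sketched as follows. It uses the 1,895 training examples reported in the tokenization log and assumes a single device and that optimizer updates per epoch are obtained by floor division; under those assumptions the result matches the `global_step=177` reported in the training output.

```python
import math

train_examples = 1895            # from the tokenization progress log
per_device_batch_size = 8
gradient_accumulation_steps = 4
num_train_epochs = 3

# Examples contributing to each weight update (single device assumed)
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Number of batches the data loader yields per epoch
batches_per_epoch = math.ceil(train_examples / per_device_batch_size)

# Assumption: optimizer updates per epoch are computed by floor division
updates_per_epoch = batches_per_epoch // gradient_accumulation_steps
total_updates = updates_per_epoch * num_train_epochs

print(effective_batch_size, batches_per_epoch, updates_per_epoch, total_updates)
```

Because 237 batches do not divide evenly by 4, training stops slightly short of three full epochs, which is consistent with the logged final epoch of about 2.96.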


## 🏋️ 8. Initialize Trainer and Start Fine-Tuning

We use Hugging Face’s Trainer API to train the model and handle evaluation, checkpointing, and logging automatically.

In [12]:
# import torch
# torch.cuda.empty_cache()

# import os
# os.environ["PYTORCH_MPS_HIGH_WATERMARK_RATIO"] = "0.0"


In [13]:
# Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics
)

# Train
trainer.train()


| Epoch | Training Loss | Validation Loss | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|---|---|
| 1 | 0.4085 | 0.443731 | 0.824895 | 0.278261 | 0.246154 | 0.32 |
| 2 | 0.1805 | 0.348909 | 0.871308 | 0.207792 | 0.296296 | 0.16 |


TrainOutput(global_step=177, training_loss=0.31061335876163115, metrics={'train_runtime': 106.5065, 'train_samples_per_second': 53.377, 'train_steps_per_second': 1.662, 'total_flos': 121465321484700.0, 'train_loss': 0.31061335876163115, 'epoch': 2.962025316455696})

This code sets up and runs the training process using the `Trainer` class from the Hugging Face library.

1. **Trainer Setup**:
   - `model=model`: Specifies the model to be trained (the previously loaded BioBERT model).
   - `args=training_args`: Passes the previously defined training arguments (`TrainingArguments`) to configure the training process.
   - `train_dataset=tokenized_datasets["train"]`: Provides the tokenized training dataset for model training.
   - `eval_dataset=tokenized_datasets["test"]`: Provides the tokenized test split, which serves as the evaluation set during training (no separate validation split is created).
   - `tokenizer=tokenizer`: Passes the tokenizer to ensure text is correctly processed during training.
   - `data_collator=data_collator`: Specifies the data collator that handles dynamic padding during training.
   - `compute_metrics=compute_metrics`: Specifies the function to compute evaluation metrics (accuracy, F1 score, etc.) during training.

2. **Training**:
   - `trainer.train()`: This starts the training process using the specified model, datasets, and training arguments. The model is trained on the training split and evaluated on the test split at the end of each epoch.


## ✅ 9. Run Final Evaluation on Test Set

After training, we evaluate the model on the test set and print key performance metrics to assess its effectiveness.

In [14]:
# Evaluate
metrics = trainer.evaluate()
print(metrics)

{'eval_loss': 0.4437311887741089, 'eval_accuracy': 0.8248945147679325, 'eval_f1': 0.2782608695652174, 'eval_precision': 0.24615384615384617, 'eval_recall': 0.32, 'eval_runtime': 1.1123, 'eval_samples_per_second': 426.126, 'eval_steps_per_second': 53.94, 'epoch': 2.962025316455696}
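
The final `eval_loss` and `eval_f1` match the epoch 1 row of the training log, which is consistent with `load_best_model_at_end=True` and `metric_for_best_model="f1"` restoring the highest-F1 checkpoint. The sketch below reproduces that selection from the logged rows; it illustrates the selection rule only and is not the Trainer's internal code.

```python
# Per-epoch evaluation rows copied from the training log in section 8
epoch_logs = [
    {"epoch": 1, "eval_loss": 0.443731, "accuracy": 0.824895, "f1": 0.278261},
    {"epoch": 2, "eval_loss": 0.348909, "accuracy": 0.871308, "f1": 0.207792},
]

# Pick the epoch with the highest F1, ignoring loss and accuracy
best = max(epoch_logs, key=lambda row: row["f1"])
print(f"best epoch by F1: {best['epoch']} (f1={best['f1']})")
```

Note that epoch 2 has lower validation loss and higher accuracy but a lower F1, so the choice of selection metric changes which checkpoint is kept.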


## 💾 10. Save the Fine-Tuned Model and Tokenizer

The model and tokenizer are saved to a folder called `biobert_misinformation_model/`, which will later be used in the Streamlit frontend.

In [16]:
# Save model and tokenizer
model.save_pretrained("/content/drive/My Drive/Transformers-Final-proj/biobert_misinformation_model")
tokenizer.save_pretrained("/content/drive/My Drive/Transformers-Final-proj/biobert_misinformation_model")


('/content/drive/My Drive/Transformers-Final-proj/biobert_misinformation_model/tokenizer_config.json',
 '/content/drive/My Drive/Transformers-Final-proj/biobert_misinformation_model/special_tokens_map.json',
 '/content/drive/My Drive/Transformers-Final-proj/biobert_misinformation_model/vocab.txt',
 '/content/drive/My Drive/Transformers-Final-proj/biobert_misinformation_model/added_tokens.json',
 '/content/drive/My Drive/Transformers-Final-proj/biobert_misinformation_model/tokenizer.json')

In [18]:
# Prepare Explanation Dataset
import pandas as pd

claims_df = pd.read_csv("claims.csv")

# Only keep claims with a usable description (non-null, decently long).
# .copy() makes the filtered frame independent so a new column can be added safely.
explanation_df = claims_df[["statement", "description"]].dropna()
explanation_df = explanation_df[explanation_df["description"].str.len() > 30].copy()

# Optional: combine rating into target to enrich generation
def create_target(row):
    return f"This claim is rated as '{row['rating']}'. {row['description']}"

explanation_df["target"] = claims_df.loc[explanation_df.index].apply(create_target, axis=1)

# Keep only statement + enriched target
final_explanation_data = explanation_df[["statement", "target"]]
final_explanation_data = final_explanation_data.dropna()

# Save to CSV
final_explanation_data.to_csv("/content/drive/My Drive/Transformers-Final-proj/claim_explanations.csv", index=False)
print("Saved claim_explanations.csv with", len(final_explanation_data), "rows.")


Saved claim_explanations.csv with 2608 rows.


## 💬 11. Model Inference

This section runs inference with the fine-tuned model on individual claims.

The confidence score here represents the model's confidence that a claim is credible.

Example claims:

1. Eating dark chocolate can improve heart health.
2. Running regularly can reduce the risk of chronic diseases like diabetes.
3. Taking multivitamins improves overall health.
4. Yoga helps in reducing stress levels.
5. Consuming too much caffeine can lead to increased anxiety.
6. Drinking energy drinks improves physical performance.
7. Eating pineapple helps with weight loss.
8. Wearing glasses can improve eyesight.
9. Consuming garlic prevents colds.
10. Meditation can cure mental health disorders.
11. Using a standing desk can significantly reduce back pain.
12. Drinking lemon water detoxifies the body.

In [5]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the saved model and tokenizer
model_path = "/content/drive/My Drive/Transformers-Final-proj/biobert_misinformation_model"
model = AutoModelForSequenceClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)


We're loading the previously saved model and tokenizer for inference. 

1. **Loading the Model**:
   - `model_path`: Specifies the path where the model is saved on Google Drive.
   - `model = AutoModelForSequenceClassification.from_pretrained(model_path)`: Loads the model from the saved checkpoint at the specified path. The model is of type `AutoModelForSequenceClassification`, which is suitable for classification tasks like determining claim credibility.

2. **Loading the Tokenizer**:
   - `tokenizer = AutoTokenizer.from_pretrained(model_path)`: Loads the tokenizer associated with the saved model. This tokenizer ensures that the input text is properly tokenized before being passed to the model for inference.


In [93]:
import torch
# Inference function for BioBERT to predict the credibility percentage
def predict_claim_credibility(claim):
    # Tokenize the input claim
    inputs = tokenizer(claim, return_tensors="pt", padding=True, truncation=True)

    # Get model output (logits)
    model.eval()  # Set model to evaluation mode
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits

    # Apply sigmoid to each logit independently. Note: the two class scores
    # are not normalized to sum to 1, unlike a softmax over the logits.
    probability = torch.sigmoid(logits)

    # Convert to percentage
    confidence_score = probability[0][1].item() * 100  # Confidence of the claim being credible
    return confidence_score


claim = "Wearing glasses can improve eyesight."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")


Claim: Wearing glasses can improve eyesight.
Credibility: 52.63313055038452%


This is an inference function to predict the credibility of a claim using the trained BioBERT model.

1. **Inference Function**:
   - `predict_claim_credibility(claim)`: This function takes a `claim` as input and predicts the **credibility percentage**.
   
2. **Tokenizing the Claim**:
   - `inputs = tokenizer(...)`: The claim is tokenized into the format expected by the model and returned as PyTorch tensors (`return_tensors="pt"`), with truncation enabled. Since a single claim is passed, `padding=True` has no effect here.

3. **Model Inference**:
   - `model.eval()`: Sets the model to evaluation mode (important for proper behavior during inference).
   - `with torch.no_grad()`: Disables gradient calculations to speed up inference and save memory.
   - `outputs = model(**inputs)`: The model processes the tokenized input and generates the output logits.
   
4. **Confidence Score Calculation**:
   - `logits = outputs.logits`: Extracts the raw logits (unscaled predictions) from the model's output.
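   - `probability = torch.sigmoid(logits)`: Applies the sigmoid function to each of the two logits independently, mapping each to a value between 0 and 1. Because the model has a two-class head (`num_labels=2`), these two values are not normalized to sum to 1; a softmax over the logits, as described in the introduction, would yield a normalized class probability.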
   - `confidence_score = probability[0][1].item() * 100`: The confidence score for the claim being credible is extracted and multiplied by 100 to convert it into a percentage.

5. **Printing the Result**:
   - The function returns the calculated **credibility percentage** for the input claim.
   - The claim and its predicted credibility percentage are printed to the console.
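
The difference between applying sigmoid per logit and applying softmax across the two logits can be seen in a small pure-Python sketch. The logits are hypothetical values chosen for illustration, not outputs of the trained model.

```python
import math


def sigmoid(x):
    """Map a single logit to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Normalize a list of logits into probabilities that sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits in the order [not credible, credible]
logits = [-0.4, 0.1]
sigmoid_scores = [sigmoid(z) for z in logits]
softmax_probs = softmax(logits)

print("sigmoid per logit:", sigmoid_scores, "sum =", sum(sigmoid_scores))
print("softmax:", softmax_probs, "sum =", sum(softmax_probs))
```

For two classes, the softmax probability of the credible class equals the sigmoid of the difference between the two logits, so it depends on both logits rather than on the credible logit alone. In PyTorch the normalized version would be `torch.softmax(logits, dim=-1)`; switching to it would change the percentages recorded in the cells below.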


In [67]:
claim = "In 2020, two school boys in China died suddenly after wearing face masks during physical exercise."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: In 2020, two school boys in China died suddenly after wearing face masks during physical exercise.
Credibility: 68.1206464767456%


In [58]:
claim = "In January 2020, the Food and Drug Administration approved a nasal spray containing cocaine."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: In January 2020, the Food and Drug Administration approved a nasal spray containing cocaine.
Credibility: 67.90470480918884%


In [62]:
claim = "A viral social media post in January 2020 represented an authentic, accurate 'health bulletin' about the new coronavirus outbreak from an official public health authority."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: A viral social media post in January 2020 represented an authentic, accurate 'health bulletin' about the new coronavirus outbreak from an official public health authority.
Credibility: 65.17950892448425%


In [34]:
claim = "A photograph shows vintage box of fake snow decor made of the carcinogen asbestos."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: A photograph shows vintage box of fake snow decor made of the carcinogen asbestos.
Credibility: 64.91941213607788%


In [61]:
claim = "A Kentucky couple were placed under house arrest in July 2020 after a woman diagnosed with COVID-19 refused to agree to self-isolate because it would require her to get prior approval to go to the hospital."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: A Kentucky couple were placed under house arrest in July 2020 after a woman diagnosed with COVID-19 refused to agree to self-isolate because it would require her to get prior approval to go to the hospital.
Credibility: 64.34209942817688%


In [90]:
claim = "A video shows a genuine public service announcement from 1956 about how people can avoid a future plague."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: A video shows a genuine public service announcement from 1956 about how people can avoid a future plague.
Credibility: 63.763582706451416%


In [72]:
claim = "Born Basic Anti-Bac hand sanitizer was recalled in the U.S. after being found to contain methanol, a poisonous chemical."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: Born Basic Anti-Bac hand sanitizer was recalled in the U.S. after being found to contain methanol, a poisonous chemical.
Credibility: 63.451576232910156%


In [106]:
claim = "A doctor in Italy shared numerous details about how hospitals in the country are dealing with COVID-19, a disease caused by the new coronavirus."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: A doctor in Italy shared numerous details about how hospitals in the country are dealing with COVID-19, a disease caused by the new coronavirus.
Credibility: 63.25032711029053%


In [52]:
claim = "Carsyn Leigh Davis died of COVID-19 shortly after attending a 'COVID party' at her youth church."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: Carsyn Leigh Davis died of COVID-19 shortly after attending a 'COVID party' at her youth church.
Credibility: 61.904627084732056%


In [45]:
claim = "A group of 43 European countries and territories have far more people than the UK, but fewer Covid-19 deaths."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: A group of 43 European countries and territories have far more people than the UK, but fewer Covid-19 deaths.
Credibility: 61.181193590164185%


In [64]:
claim = "Health experts predicted the new coronavirus could kill 65 million people."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: Health experts predicted the new coronavirus could kill 65 million people.
Credibility: 59.04858708381653%


In [108]:
claim = "On average, do men and women differ cognitively?"
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: On average, do men and women differ cognitively?
Credibility: 40.32866060733795%


In [107]:
claim = "The black lung lie: It’s the widespread belief that smokers’ lungs turn black."
credibility_percentage = predict_claim_credibility(claim)
print(f"Claim: {claim}\nCredibility: {credibility_percentage}%")

Claim: The black lung lie: It’s the widespread belief that smokers’ lungs turn black.
Credibility: 36.35234832763672%
