<a href="https://colab.research.google.com/github/jeanmaxcacacho/199.X-Mformer/blob/main/Mformer_replicate.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### TODOs in this notebook:
*   Get sample data from our actual dataset
*   Instantiate the Mformer model for every foundation
*   Obtain moral foundation measurements for all foundations

In [13]:
from transformers import AutoTokenizer
from transformers import AutoModelForSequenceClassification
import torch

FOUNDATIONS = [
    "authority",
    "care",
    "fairness",
    "loyalty",
    "sanctity"
]

# Hugging Face model IDs, in the same order as FOUNDATIONS
MODEL_FOUNDATIONS = [f"joshnguyen/mformer-{F}" for F in FOUNDATIONS]

"""
Each pretrained model is loaded into MODELS as a (tokenizer, model) tuple.

MODELS follows the order of FOUNDATIONS:
[0] authority
[1] care
[2] fairness
[3] loyalty
[4] sanctity
"""

MODELS = []
for MODEL_FOUNDATION in MODEL_FOUNDATIONS:
  MODELS.append((
      AutoTokenizer.from_pretrained(MODEL_FOUNDATION),
      AutoModelForSequenceClassification.from_pretrained(
          MODEL_FOUNDATION,
          device_map="auto"
      )
  ))
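
Positional lookups such as `MODELS[0][0]` rely on remembering the order of `FOUNDATIONS`. One possible alternative is a dict keyed by foundation name. The sketch below is only an illustration: `model_id` and `load_by_name` are hypothetical helpers that are not part of this notebook, and stand-in loader callables are used so that it runs without downloading any weights.

```python
FOUNDATIONS = ["authority", "care", "fairness", "loyalty", "sanctity"]


def model_id(foundation):
    """Build the Hugging Face model ID used in this notebook."""
    return f"joshnguyen/mformer-{foundation}"


def load_by_name(load_tokenizer, load_model, foundations=FOUNDATIONS):
    """Return {foundation: (tokenizer, model)} using the given loader callables."""
    return {
        name: (load_tokenizer(model_id(name)), load_model(model_id(name)))
        for name in foundations
    }


# Stand-in loaders (assumption, for illustration only). In the notebook these
# would be AutoTokenizer.from_pretrained and a small wrapper around
# AutoModelForSequenceClassification.from_pretrained that passes device_map="auto".
demo = load_by_name(lambda mid: f"tok:{mid}", lambda mid: f"model:{mid}")
print(demo["authority"])
```

With a structure like this, the authority tokenizer and classifier would be read as `tokenizer, model = models["authority"]` rather than by index.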

In [5]:
"""
Five sample texts from the dataset:
https://www.kaggle.com/datasets/asaniczka/reddit-on-israel-palestine-daily-updated
Inference on these texts is performed in the next cell.

The texts were picked arbitrarily from the dataset's preview table; they are not a proper sample!
"""

instances = [
  "who confiscates a small child's bike? nazis",
  "Why aren't they deliberately hitting schools and hospitals? Amateurs!",
  "What are you expecting? That the UN will declare a war on a country to save the sorry hide of hamas?",
  "Superman says NO to genocide!",
  "Only criminals hide their faces. Hope they het the pineapple treatment on hell"
]

In [14]:
# AUTHORITY
authority_tokenizer = MODELS[0][0]
authority_classifier = MODELS[0][1]

authority_inputs = authority_tokenizer(
    instances,
    padding=True,
    truncation=True,
    return_tensors='pt'
).to(authority_classifier.device)

authority_classifier.eval()
with torch.no_grad():
  authority_outputs = authority_classifier(**authority_inputs)

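# Convert the logits to class probabilities and keep column 1,
# which this notebook reads as the probability that the foundation is present
authority_probs = torch.softmax(authority_outputs.logits, dim=1)
authority_probs = authority_probs[:, 1]
authority_probs = authority_probs.detach().cpu().numpy()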

# Print results
print("Probability of foundation 'AUTHORITY':", "\n")
for instance, prob in zip(instances, authority_probs):
    print(instance, "->", prob, "\n")

Probability of foundation 'AUTHORITY': 

who confiscates a small child's bike? nazis -> 0.3475795 

Why aren't they deliberately hitting schools and hospitals? Amateurs! -> 0.2956997 

What are you expecting? That the UN will declare a war on a country to save the sorry hide of hamas? -> 0.27424964 

Superman says NO to genocide! -> 0.22523473 

Only criminals hide their faces. Hope they het the pineapple treatment on hell -> 0.22267361 
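
The probabilities printed above come from a softmax over two logits per text, with index 1 taken as the "foundation present" class. As a rough illustration of that step, here is a pure-Python sketch. The logit values are made up for the example and are not model outputs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one text: [foundation absent, foundation present]
example_logits = [0.5, -0.13]

probs = softmax(example_logits)
positive_prob = probs[1]  # same column the notebook selects with [:, 1]

# With two classes, the softmax of class 1 equals a sigmoid of the logit difference
sigmoid_of_diff = 1 / (1 + math.exp(example_logits[0] - example_logits[1]))

print(positive_prob, sigmoid_of_diff)
```

Because the two probabilities always sum to 1, a score below 0.5 means the "absent" logit was the larger of the two.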

