In [52]:
pip install -U sentence-transformers


Collecting sentence-transformers
  Downloading sentence_transformers-5.2.0-py3-none-any.whl.metadata (16 kB)
Downloading sentence_transformers-5.2.0-py3-none-any.whl (493 kB)
Installing collected packages: sentence-transformers
  Attempting uninstall: sentence-transformers
    Found existing installation: sentence-transformers 5.1.2
    Uninstalling sentence-transformers-5.1.2:
      Successfully uninstalled sentence-transformers-5.1.2
Successfully installed sentence-transformers-5.2.0


In [53]:
from sentence_transformers import SentenceTransformer

# 1. Load a pretrained Sentence Transformer model
model = SentenceTransformer("all-MiniLM-L6-v2")

# The sentences to encode
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]

# 2. Calculate embeddings by calling model.encode()
embeddings = model.encode(sentences)
print(embeddings.shape)


# 3. Calculate the embedding similarities
similarities = model.similarity(embeddings, embeddings)
print(similarities)


(3, 384)
tensor([[1.0000, 0.6660, 0.1046],
        [0.6660, 1.0000, 0.1411],
        [0.1046, 0.1411, 1.0000]])
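
The matrix above can be reproduced by hand. With the default setting, `model.similarity` returns cosine similarities: the dot product of two vectors divided by the product of their norms. Below is a minimal pure-Python sketch on made-up 3-dimensional vectors; the vectors and the helper name `cosine_matrix` are illustrative assumptions, not real embeddings or library functions.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_matrix(rows_a, rows_b):
    """All-pairs cosine similarity, one output row per vector in rows_a."""
    return [[cosine(u, v) for v in rows_b] for u in rows_a]

# Toy stand-ins for embeddings (hypothetical values)
toy = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 2.0],
]

for row in cosine_matrix(toy, toy):
    print([round(x, 4) for x in row])
```

As in the real output, the diagonal is 1 and the matrix is symmetric, because each vector is compared with itself and the measure does not depend on argument order.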


In [54]:
# The Aristotle vs. Galileo fight!

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "Todos los cuerpos caen a la misma velocidad.",
    "Los cuerpos mas pesados caen mas rápido.",
   ]

embeddings = model.encode(sentences)
print(embeddings.shape)


similarities = model.similarity(embeddings, embeddings)
print(similarities)


(2, 384)
tensor([[1.0000, 0.7738],
        [0.7738, 1.0000]])


In [55]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "Todos los cuerpos caen a la misma velocidad.",
    "La fotosíntesis ocurre en los cloroplastos.",
   ]

embeddings = model.encode(sentences)
print(embeddings.shape)


similarities = model.similarity(embeddings, embeddings)
print(similarities)


(2, 384)
tensor([[1.0000, 0.4054],
        [0.4054, 1.0000]])


In [56]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "Todos los cuerpos caen a la misma velocidad.",
    "En ausencia de rozamiento, todos los objetos caen igual.",
   ]

embeddings = model.encode(sentences)
print(embeddings.shape)


similarities = model.similarity(embeddings, embeddings)
print(similarities)


(2, 384)
tensor([[1.0000, 0.4868],
        [0.4868, 1.0000]])


In [12]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "Todos los cuerpos caen a la misma velocidad.",
    "No es cierto que todos los cuerpos caigan a la misma velocidad.",
   ]

embeddings = model.encode(sentences)
print(embeddings.shape)


similarities = model.similarity(embeddings, embeddings)
print(similarities)


# Cosine similarity is the cosine of the angle between the two embedding vectors.
# It does not measure truth, scientific correctness, or logical consistency; it only
# measures semantic proximity learned statistically.

# For search or clustering, this method works well.
# For contradiction detection or logical-consistency checks, it is insufficient.
# Worth exploring: models trained for NLI (entailment / contradiction) or explicit
# reasoning (LLMs with controlled prompting).

# https://www.sbert.net/docs/pretrained-models/nli-models.html
# See https://huggingface.co/papers/1705.02364 and https://arxiv.org/abs/1705.02364
# https://github.com/huggingface/sentence-transformers/blob/main/examples/sentence_transformer/training/nli/training_nli.py


(2, 384)
tensor([[1.0000, 0.9157],
        [0.9157, 1.0000]])
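
One intuition for the 0.9157 score: a sentence and its explicit negation share almost all of their words, and nothing in a similarity score encodes polarity. The toy below makes that concrete with a plain bag-of-words cosine. It is only an analogy under a stated assumption (word counts as vectors); it is not how all-MiniLM-L6-v2 computes its embeddings, and `bow_cosine` is an illustrative helper.

```python
import math
from collections import Counter

def bow_cosine(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)  # missing words count as 0
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b)

statement = "all bodies fall at the same speed"
negation = "it is not true that all bodies fall at the same speed"
unrelated = "photosynthesis happens in chloroplasts"

print(round(bow_cosine(statement, negation), 4))   # high: most words are shared
print(round(bow_cosine(statement, unrelated), 4))  # zero: no words are shared
```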


In [13]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Two lists of sentences
sentences1 = [
    "The new movie is awesome",
    "The cat sits outside",
    "A man is playing guitar",
]

sentences2 = [
    "The dog plays in the garden",
    "The new movie is so great",
    "A woman watches TV",
]

# Compute embeddings for both lists
embeddings1 = model.encode(sentences1)
embeddings2 = model.encode(sentences2)

# Compute cosine similarities
similarities = model.similarity(embeddings1, embeddings2)

# Output the pairs with their score
for idx_i, sentence1 in enumerate(sentences1):
    print(sentence1)
    for idx_j, sentence2 in enumerate(sentences2):
        print(f" - {sentence2: <30}: {similarities[idx_i][idx_j]:.4f}")

The new movie is awesome
 - The dog plays in the garden   : 0.0543
 - The new movie is so great     : 0.8939
 - A woman watches TV            : -0.0502
The cat sits outside
 - The dog plays in the garden   : 0.2838
 - The new movie is so great     : -0.0029
 - A woman watches TV            : 0.1310
A man is playing guitar
 - The dog plays in the garden   : 0.2277
 - The new movie is so great     : -0.0136
 - A woman watches TV            : -0.0327
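
A common follow-up to the loop above is to keep only the best match for each sentence in the first list. The sketch below hard-codes the scores printed above so that it runs without loading the model; `best_matches` is an illustrative helper name, not part of sentence-transformers.

```python
sentences1 = [
    "The new movie is awesome",
    "The cat sits outside",
    "A man is playing guitar",
]
sentences2 = [
    "The dog plays in the garden",
    "The new movie is so great",
    "A woman watches TV",
]

# Scores copied from the output above (rows: sentences1, columns: sentences2)
scores = [
    [0.0543, 0.8939, -0.0502],
    [0.2838, -0.0029, 0.1310],
    [0.2277, -0.0136, -0.0327],
]

def best_matches(rows, candidates):
    """For each row of scores, return (best candidate, its score)."""
    result = []
    for row in rows:
        j = max(range(len(row)), key=row.__getitem__)  # index of the largest score
        result.append((candidates[j], row[j]))
    return result

for sentence, (match, score) in zip(sentences1, best_matches(scores, sentences2)):
    print(f"{sentence!r} -> {match!r} ({score:.4f})")
```

An argmax always returns something, even when every score in the row is low (as in the second and third rows), so a minimum-score threshold is usually worth adding in practice.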


In [24]:
from sentence_transformers import SentenceTransformer, SimilarityFunction

# Load a pretrained Sentence Transformer model
model = SentenceTransformer("all-MiniLM-L6-v2")

# Embed some sentences
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities)


# Change the similarity function to Manhattan distance
model.similarity_fn_name = SimilarityFunction.MANHATTAN
print(model.similarity_fn_name)


similarities = model.similarity(embeddings, embeddings)
print(similarities)


tensor([[1.0000, 0.6660, 0.1046],
        [0.6660, 1.0000, 0.1411],
        [0.1046, 0.1411, 1.0000]])
manhattan
tensor([[ -0.0000, -12.6269, -20.2167],
        [-12.6269,  -0.0000, -20.1288],
        [-20.2167, -20.1288,  -0.0000]])
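
The negative values are expected: with `SimilarityFunction.MANHATTAN`, the scores are negated Manhattan (L1) distances, so larger still means more similar and identical vectors score 0, as the diagonal above shows. A pure-Python sketch on made-up vectors (hypothetical values, not real embeddings; `manhattan_similarity` is an illustrative name):

```python
def manhattan_similarity(u, v):
    """Negated L1 distance: 0 for identical vectors, more negative when farther apart."""
    return -sum(abs(a - b) for a, b in zip(u, v))

# Toy stand-ins for embeddings (hypothetical values)
toy = [
    [0.5, 1.0, -2.0],
    [0.0, 1.0, 1.0],
    [3.0, -1.0, 0.5],
]

matrix = [[manhattan_similarity(u, v) for v in toy] for u in toy]
for row in matrix:
    print(row)
```

Unlike cosine similarity, these scores are unbounded and depend on vector magnitude, so a threshold tuned for one similarity function does not transfer to the other.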


In [16]:
from transformers import pipeline

# Multilingual NLI model (includes Spanish)
clf = pipeline(
    task="text-classification",
    model="joeddav/xlm-roberta-large-xnli",
    tokenizer="joeddav/xlm-roberta-large-xnli",
    return_all_scores=True,
)

premise = "Todos los cuerpos caen a la misma velocidad."
hypothesis = "Los cuerpos mas pesados caen mas rápido."

# For NLI: pass the premise as "text" and the hypothesis as "text_pair"
out = clf({"text": premise, "text_pair": hypothesis})[0]

# Sort by score and print
out_sorted = sorted(out, key=lambda x: x["score"], reverse=True)
for r in out_sorted:
    print(f"{r['label']:>12}: {r['score']:.4f}")

# Note: this is a very heavy model (the weights are a 2.24 GB download)

config.json:   0%|          | 0.00/734 [00:00<?, ?B/s]

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`


model.safetensors:   0%|          | 0.00/2.24G [00:00<?, ?B/s]


KeyboardInterrupt



In [18]:
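import os

# Set these before importing transformers: huggingface_hub reads them once, at import
# time. If it was already imported in this kernel, restart the kernel for them to apply.
os.environ["HF_HUB_DISABLE_SYMLINKS_WARNING"] = "1"
os.environ["HF_HUB_DISABLE_PROGRESS_BARS"] = "1"

import time
from transformers import AutoTokenizer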

model_name = "MoritzLaurer/mDeBERTa-v3-base-mnli-xnli"

print("Starting tokenizer download...")
t0 = time.time()

tokenizer = AutoTokenizer.from_pretrained(model_name)

print("Tokenizer downloaded in", round(time.time() - t0, 1), "s")


Starting tokenizer download...


tokenizer_config.json: 0.00B [00:00, ?B/s]

Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`


spm.model:   0%|          | 0.00/4.31M [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/16.3M [00:00<?, ?B/s]

added_tokens.json:   0%|          | 0.00/23.0 [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/286 [00:00<?, ?B/s]

Tokenizer downloaded in 44.0 s


In [29]:
import os

# Set before importing transformers (huggingface_hub reads these at import time)
os.environ["HF_HUB_DISABLE_SYMLINKS_WARNING"] = "1"
os.environ["HF_HUB_DISABLE_PROGRESS_BARS"] = "1"
os.environ["TOKENIZERS_PARALLELISM"] = "false"

import time
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "MoritzLaurer/mDeBERTa-v3-base-mnli-xnli"

premise = "Todos los cuerpos caen a la misma velocidad."
hypothesis = "Los cuerpos mas pesados caen mas rápido."

t0 = time.time()
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()
print("Model loaded in", round(time.time() - t0, 1), "s")

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1).squeeze().tolist()

# Take the label order from the model config instead of hard-coding it.
# For this checkpoint id2label is {0: 'entailment', 1: 'neutral', 2: 'contradiction'}.
labels = [model.config.id2label[i].upper() for i in range(len(probs))]

for lab, p in sorted(zip(labels, probs), key=lambda x: x[1], reverse=True):
    print(f"{lab:>13}: {p:.4f}")


# High CONTRADICTION → the model detects a textual incompatibility (expected here).
# High NEUTRAL → the model sees no clear "logical" contradiction (happens with ambiguous sentences).
# High ENTAILMENT → something is off (badly tokenized sentence or a different hypothesis).


Model loaded in 2.2 s
      NEUTRAL: 0.9993
CONTRADICTION: 0.0004
   ENTAILMENT: 0.0003


In [31]:
{0: 'entailment', 1: 'neutral', 2: 'contradiction'}
{'contradiction': 2, 'entailment': 0, 'neutral': 1}

Model loaded in 2.1 s
      NEUTRAL: 0.9967
   ENTAILMENT: 0.0018
CONTRADICTION: 0.0015


In [32]:
print(model.config.id2label)
print(model.config.label2id)


{0: 'entailment', 1: 'neutral', 2: 'contradiction'}
{'contradiction': 2, 'entailment': 0, 'neutral': 1}
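
Since the label order differs between checkpoints, it is safer to build the label list from `id2label` than to hard-code it. The sketch below uses the mapping printed above together with made-up logits (hypothetical values, not model output) and a hand-written softmax, so it runs without torch; `label_scores` is an illustrative helper name.

```python
import math

# Mapping printed by model.config.id2label above
id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_scores(logits, id2label):
    """Pair each probability with its label, highest probability first."""
    probs = softmax(logits)
    pairs = [(id2label[i].upper(), p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

# Hypothetical logits in index order 0, 1, 2
hypothetical_logits = [-1.0, 3.0, 0.5]
for label, prob in label_scores(hypothetical_logits, id2label):
    print(f"{label:>13}: {prob:.4f}")
```

Looking labels up by index means the printed names stay correct even if a different checkpoint orders its classes differently.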


In [33]:
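# The label mapping was checked above with model.config.id2label.
# NEUTRAL (index 1) dominates again: the model does not detect the evident contradiction.

import os

# Set before importing transformers (huggingface_hub reads these at import time)
os.environ["HF_HUB_DISABLE_SYMLINKS_WARNING"] = "1"
os.environ["HF_HUB_DISABLE_PROGRESS_BARS"] = "1"
os.environ["TOKENIZERS_PARALLELISM"] = "false"

import time
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification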

model_name = "MoritzLaurer/mDeBERTa-v3-base-mnli-xnli"

premise = "Ningún cuerpo, independientemente de su peso, cae más rápido que otro."
hypothesis = "Algunos cuerpos más pesados caen más rápido que otros."

t0 = time.time()
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()
print("Model loaded in", round(time.time() - t0, 1), "s")

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1).squeeze().tolist()

# Take the label order from the model config instead of hard-coding it.
# For this checkpoint id2label is {0: 'entailment', 1: 'neutral', 2: 'contradiction'}.
labels = [model.config.id2label[i].upper() for i in range(len(probs))]

for lab, p in sorted(zip(labels, probs), key=lambda x: x[1], reverse=True):
    print(f"{lab:>13}: {p:.4f}")


# High CONTRADICTION → the model detects a textual incompatibility (expected here).
# High NEUTRAL → the model sees no clear "logical" contradiction (happens with ambiguous sentences).
# High ENTAILMENT → something is off (badly tokenized sentence or a different hypothesis).



Model loaded in 2.3 s
      NEUTRAL: 0.8617
CONTRADICTION: 0.1354
   ENTAILMENT: 0.0029


In [51]:
import os

# Set before importing transformers (huggingface_hub reads these at import time)
os.environ["HF_HUB_DISABLE_SYMLINKS_WARNING"] = "1"
os.environ["HF_HUB_DISABLE_PROGRESS_BARS"] = "1"
os.environ["TOKENIZERS_PARALLELISM"] = "false"

import time
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "MoritzLaurer/mDeBERTa-v3-base-mnli-xnli"

#premise = "All bodies, regardless of their weight, fall at the same speed."
#hypothesis = "Heavier bodies fall at a higher speed."

#premise = "For any two bodies with different weights, both fall at the same speed."
#hypothesis = "There exists at least one pair of bodies with different weights such that the heavier one falls faster."

premise = "In a vacuum, all bodies fall at the same speed, regardless of their weight."
hypothesis = "In a vacuum, heavier bodies fall at a higher speed."

t0 = time.time()
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()
print("Model loaded in", round(time.time() - t0, 1), "s")

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1).squeeze().tolist()

# Take the label order from the model config instead of hard-coding it.
# For this checkpoint id2label is {0: 'entailment', 1: 'neutral', 2: 'contradiction'}.
labels = [model.config.id2label[i].upper() for i in range(len(probs))]

for lab, p in sorted(zip(labels, probs), key=lambda x: x[1], reverse=True):
    print(f"{lab:>13}: {p:.4f}")


# High CONTRADICTION → the model detects a textual incompatibility (expected here).
# High NEUTRAL → the model sees no clear "logical" contradiction (happens with ambiguous sentences).
# High ENTAILMENT → something is off (badly tokenized sentence or a different hypothesis).

Model loaded in 2.1 s
      NEUTRAL: 0.9766
   ENTAILMENT: 0.0117
CONTRADICTION: 0.0117


In [46]:
pip install -U sentencepiece

Collecting sentencepiece
  Downloading sentencepiece-0.2.1-cp312-cp312-win_amd64.whl.metadata (10 kB)
Downloading sentencepiece-0.2.1-cp312-cp312-win_amd64.whl (1.1 MB)
Installing collected packages: sentencepiece
Successfully installed sentencepiece-0.2.1
Note: you may need to restart the kernel to use updated packages.


In [None]:
Why does the model insist on NEUTRAL?

Because NLI does not evaluate physics or universal laws. It answers this question: "If the premise is true, is the hypothesis necessarily false?"

From its human-annotated training examples, the model has learned that scientific statements can be loosely worded. For instance, "fall at the same speed" can be read as:

- the same initial speed
- the same average speed
- the same idealized regime

Even with "in a vacuum", the training corpus does not strongly penalize the contradiction.

The model prefers to be conservative and answer neutral rather than raise a false contradiction.

This behavior is a consequence of how it was trained.

Main conclusion (the one that matters in practice)

These experiments suggest that generic NLI models are not reliable for detecting scientific or physical contradictions when the contradiction depends on theoretical assumptions.

They work for:

- textual contradictions
- explicit negations
- simple, well-aligned quantifiers
- obvious rephrasings

They do not work for:

- physical laws
- scientific consistency
- "this violates a principle"
- contradictions that require a world model

If the goal is to detect inconsistencies of the "this violates a law" kind, an additional layer is needed:

Option A: NLI + explicit rules

Example: detect patterns such as

("regardless of weight" AND "fall at same speed")

together with opposing comparatives

("heavier" AND "higher speed")

→ flag the contradiction by rule, not by NLI
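
Option A could be prototyped with a few regular expressions. This is a minimal sketch under strong assumptions: the patterns are hand-written for this one example in English, and `rule_contradiction` is a hypothetical helper, not part of any library.

```python
import re

# Premise-side cues: equal fall speed, independent of weight
SAME_SPEED = re.compile(r"\bfall\b.*\bsame speed\b", re.IGNORECASE)
REGARDLESS = re.compile(r"\bregardless of (?:their |its )?weight\b", re.IGNORECASE)

# Hypothesis-side cues: an opposing comparative
HEAVIER = re.compile(r"\bheavier\b", re.IGNORECASE)
FASTER = re.compile(r"\b(?:higher speed|faster)\b", re.IGNORECASE)

def rule_contradiction(premise, hypothesis):
    """Return True when the premise asserts weight-independent fall speed
    and the hypothesis asserts that heavier bodies fall faster."""
    premise_equal = bool(SAME_SPEED.search(premise) and REGARDLESS.search(premise))
    hypothesis_opposite = bool(HEAVIER.search(hypothesis) and FASTER.search(hypothesis))
    return premise_equal and hypothesis_opposite

premise = "In a vacuum, all bodies fall at the same speed, regardless of their weight."
hypothesis = "In a vacuum, heavier bodies fall at a higher speed."

print(rule_contradiction(premise, hypothesis))
print(rule_contradiction(premise, "In a vacuum, a feather and a hammer land together."))
```

Rules like these are brittle: a negated hypothesis such as "heavier bodies do not fall faster" would still trigger the pattern, so in practice they complement the NLI score rather than replace it.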

Option B: LLM with explicit reasoning

A large LLM, with a prompt such as:

"Assume classical mechanics in vacuum. Are these two statements compatible?"

This is no longer NLI; it is guided reasoning.
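
A sketch of how such a prompt might be assembled and its answer parsed. No particular LLM API is assumed: `build_prompt` and `parse_verdict` are hypothetical helpers, and the reply string below is hand-written for illustration, not model output.

```python
def build_prompt(assumptions, statement_a, statement_b):
    """Assemble a compatibility question with the background theory stated up front."""
    return (
        f"Assume {assumptions}.\n"
        f"Statement A: {statement_a}\n"
        f"Statement B: {statement_b}\n"
        "Are these two statements compatible? "
        "Answer with COMPATIBLE or INCOMPATIBLE on the first line, then explain."
    )

def parse_verdict(reply):
    """Read the verdict from the first line of a reply; None if it is missing."""
    lines = reply.strip().splitlines()
    first = lines[0].strip().upper() if lines else ""
    if first.startswith("INCOMPATIBLE"):
        return False
    if first.startswith("COMPATIBLE"):
        return True
    return None

prompt = build_prompt(
    "classical mechanics in vacuum",
    "All bodies fall at the same speed, regardless of their weight.",
    "Heavier bodies fall at a higher speed.",
)
print(prompt)

# Hand-written example reply (not produced by any model)
example_reply = "INCOMPATIBLE\nStatement B makes fall speed depend on weight."
print(parse_verdict(example_reply))
```

Asking for a fixed keyword on the first line keeps the parsing trivial; a reply that ignores the format yields `None` instead of a guessed verdict.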

Option C: Formalization (the most rigorous route)

Translate both statements to logic:

∀w₁,w₂ : v(w₁)=v(w₂)

∃w₁>w₂ : v(w₁)>v(w₂)

There the contradiction is trivial, but this is outside the scope of NLI.
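
For completeness, the two formulas above can be shown to be jointly unsatisfiable in three steps. This is a sketch that reads v(w) as the fall speed of a body of weight w; the witness names a and b are introduced here only for the derivation.

```latex
\begin{align*}
P &:\ \forall w_1, w_2 :\ v(w_1) = v(w_2) \\
H &:\ \exists w_1 > w_2 :\ v(w_1) > v(w_2) \\[4pt]
&\text{From } H,\ \text{pick } a > b \text{ with } v(a) > v(b). \\
&\text{From } P \text{ with } w_1 = a,\ w_2 = b:\ v(a) = v(b). \\
&\text{Hence } v(b) > v(b), \text{ which is impossible, so } P \land H \vdash \bot.
\end{align*}
```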