## Testing if model.base_model() and pipeline() give the same results

In this run they appear to match. The result depends on the library version, though: on Colab, with a different transformers version, I saw differences on the order of 1e-6 to 1e-8.
In addition, model.base_model() under torch.no_grad() is faster: roughly 0.29 s versus 0.56 s between the timestamps printed below, measured on a single run.

In [1]:
import transformers
import torch
import numpy as np

In [2]:
MODEL_NAME="xlm-roberta-base"
tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_NAME)
model = transformers.AutoModelForPreTraining.from_pretrained(MODEL_NAME)#,low_cpu_mem_usage=True)

Some weights of the model checkpoint at xlm-roberta-base were not used when initializing XLMRobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing XLMRobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing XLMRobertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [3]:
pipe = transformers.pipeline(task="feature-extraction",model=MODEL_NAME,tokenizer=MODEL_NAME, return_tensors=True)#,device=0)

In [4]:
text = ["this", "is", "a", "short", "text", ",", "also", "im", "testing", "speed"]

In [6]:
!date +'%T%3N'
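# Run the bare encoder without gradient tracking and keep the last hidden states.
with torch.no_grad():
    model_out2 = model.base_model(**tokenizer(text, return_tensors='pt', padding=True))["last_hidden_state"].detach()
# Split the batch into one [1, seq_len, hidden] tensor per input, the same layout the pipeline returns.
model_output_list = [t.unsqueeze(0) for t in torch.unbind(model_out2, dim=0)]
# Mean-pool over the token axis; with padding=True this average also includes padding positions.
embedded_pooled_model=[torch.mean(elem,axis=1).cpu() for elem in model_output_list]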
!date +'%T%3N'

16:09:22547


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


16:09:22839



In [7]:
!date +'%T%3N'
embedded = pipe(text)
embedded_pooled_pipe=[torch.mean(elem,axis=1).cpu() for elem in embedded]
!date +'%T%3N'

16:09:28206



16:09:28761
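
The two timed cells above shell out to `date`, which forks the process (hence the tokenizers warning) and measures only a single run. A minimal sketch of an in-process alternative follows; `time_call` is a hypothetical helper name, and the workload shown is a stand-in rather than the notebook's model call. Setting `TOKENIZERS_PARALLELISM` is the option the warning itself suggests and only matters if the `!date` calls are kept.

```python
import os
import time

# Assumption: this runs before the tokenizer is first used, so the fork warning is not triggered.
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")


def time_call(fn, *args, repeats=5, **kwargs):
    """Call fn `repeats` times and return (best elapsed seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best, result


# Stand-in workload; in the notebook this could be `time_call(pipe, text)`.
elapsed, value = time_call(sum, range(1000))
print(f"best of 5 runs: {elapsed * 1e3:.3f} ms")
```

Taking the best of several runs reduces the influence of one-off costs such as lazy initialisation on the first call.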


In [8]:
print(embedded_pooled_model[0].size())
print(embedded_pooled_pipe[0].size())

torch.Size([1, 768])
torch.Size([1, 768])
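
Matching shapes do not show that the values match. The sketch below illustrates two things on synthetic tensors: a mean that ignores padding positions (relevant because the model call above uses `padding=True`, whereas the pipeline is assumed here to process each string on its own without padding), and a tolerance-based comparison of two lists of pooled tensors. The names `masked_mean` and `max_abs_diffs` are hypothetical helpers, not part of transformers or torch.

```python
import torch


def masked_mean(hidden, attention_mask):
    """Mean over the token axis, ignoring positions where attention_mask is 0."""
    mask = attention_mask.unsqueeze(-1).to(hidden.dtype)  # [batch, seq, 1]
    summed = (hidden * mask).sum(dim=1)                   # [batch, hidden]
    counts = mask.sum(dim=1).clamp(min=1.0)               # [batch, 1]
    return summed / counts


def max_abs_diffs(list_a, list_b):
    """Largest absolute element-wise difference for each pair of tensors."""
    return [(a - b).abs().max().item() for a, b in zip(list_a, list_b)]


# Synthetic stand-in: one sequence of three positions, the last one being padding.
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1, 0]])

pooled_masked = masked_mean(hidden, mask)  # averages the two real positions only
pooled_plain = hidden.mean(dim=1)          # also averages the padding position

print(max_abs_diffs([pooled_masked], [pooled_plain]))
print(torch.allclose(pooled_masked, pooled_plain, atol=1e-6))
```

In the notebook, `max_abs_diffs(embedded_pooled_model, embedded_pooled_pipe)` would report how far the two approaches differ per input, which is more informative than an element-wise `==` when differences of 1e-6 to 1e-8 are expected.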


In [9]:
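# Caution: both zip arguments are embedded_pooled_pipe, so this compares the pipeline
# output with itself and is trivially all True. Pass embedded_pooled_model as the
# first argument to compare the two methods.
for t1, t2 in zip(embedded_pooled_pipe, embedded_pooled_pipe):
    print(t1[0]==t2[0])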

tensor([True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, True, True, True, True, True, True, True,
        True, True, True, True, True, Tr