# Model Experiments

### First, try BERTurk from HuggingFace, a model already trained entirely on Turkish data

In [1]:
from transformers import AutoTokenizer, BertForPreTraining
import torch

tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
model = BertForPreTraining.from_pretrained("dbmdz/bert-base-turkish-cased")

In [2]:
sd_hf = model.state_dict()

for k, v in sd_hf.items():
    print(k, v.shape)

bert.embeddings.word_embeddings.weight torch.Size([32000, 768])
bert.embeddings.position_embeddings.weight torch.Size([512, 768])
bert.embeddings.token_type_embeddings.weight torch.Size([2, 768])
bert.embeddings.LayerNorm.weight torch.Size([768])
bert.embeddings.LayerNorm.bias torch.Size([768])
bert.encoder.layer.0.attention.self.query.weight torch.Size([768, 768])
bert.encoder.layer.0.attention.self.query.bias torch.Size([768])
bert.encoder.layer.0.attention.self.key.weight torch.Size([768, 768])
bert.encoder.layer.0.attention.self.key.bias torch.Size([768])
bert.encoder.layer.0.attention.self.value.weight torch.Size([768, 768])
bert.encoder.layer.0.attention.self.value.bias torch.Size([768])
bert.encoder.layer.0.attention.output.dense.weight torch.Size([768, 768])
bert.encoder.layer.0.attention.output.dense.bias torch.Size([768])
bert.encoder.layer.0.attention.output.LayerNorm.weight torch.Size([768])
bert.encoder.layer.0.attention.output.LayerNorm.bias torch.Size([768])
bert.encoder
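
The shapes printed above are enough to estimate parameter counts by hand. As a rough sketch, the snippet below multiplies out the embedding shapes shown in the output. The helper name `count_parameters` is hypothetical, and it works on plain shape tuples rather than on the real state dict. Because some entries of the real state dict are tied to each other (checked in the following cells), summing every entry of `sd_hf` this way would count the shared tensors twice.

```python
from math import prod

# Shapes copied from the embedding entries printed above.
embedding_shapes = {
    "bert.embeddings.word_embeddings.weight": (32000, 768),
    "bert.embeddings.position_embeddings.weight": (512, 768),
    "bert.embeddings.token_type_embeddings.weight": (2, 768),
    "bert.embeddings.LayerNorm.weight": (768,),
    "bert.embeddings.LayerNorm.bias": (768,),
}


def count_parameters(shapes):
    """Sum the number of elements over a mapping of name -> shape tuple (hypothetical helper)."""
    return sum(prod(shape) for shape in shapes.values())


embedding_params = count_parameters(embedding_shapes)
print(f"embedding parameters: {embedding_params:,}")  # embedding parameters: 24,972,288
```

For a real state dict, the same helper could be fed `{k: tuple(v.shape) for k, v in sd_hf.items()}`.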

In [3]:
torch.equal(sd_hf["bert.embeddings.word_embeddings.weight"], sd_hf["cls.predictions.decoder.weight"])  # weight tying check: same values

True

In [4]:
sd_hf["bert.embeddings.word_embeddings.weight"].data_ptr() == sd_hf["cls.predictions.decoder.weight"].data_ptr()  # weight tying check: same underlying storage

True

In [5]:
torch.equal(sd_hf["cls.predictions.decoder.bias"], sd_hf["cls.predictions.bias"])  # bias tying check: same values

True

In [6]:
sd_hf["cls.predictions.decoder.bias"].data_ptr() == sd_hf["cls.predictions.bias"].data_ptr()  # bias tying check: same underlying storage

True
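
Cells 3 to 6 check tying in two ways: `torch.equal` compares values, while `data_ptr()` compares the address of the underlying storage. Only the second shows that the tensors are actually shared rather than merely equal. The sketch below illustrates the difference on a toy embedding and decoder; the sizes and variable names are made up for illustration and are not taken from the BERT model.

```python
import torch
from torch import nn

vocab_size, hidden_size = 10, 4  # toy sizes, not the real 32000 x 768

embedding = nn.Embedding(vocab_size, hidden_size)
decoder = nn.Linear(hidden_size, vocab_size, bias=False)

# Tie the weights: both modules now reference the same Parameter object.
decoder.weight = embedding.weight

# An independent copy has equal values but lives in different memory.
weight_copy = embedding.weight.detach().clone()

tied_equal = torch.equal(embedding.weight, decoder.weight)
tied_shared = embedding.weight.data_ptr() == decoder.weight.data_ptr()
copy_equal = torch.equal(embedding.weight, weight_copy)
copy_shared = embedding.weight.data_ptr() == weight_copy.data_ptr()

print(tied_equal, tied_shared)  # True True
print(copy_equal, copy_shared)  # True False

# An in-place update through one module is visible through the other,
# because both point at the same storage.
with torch.no_grad():
    embedding.weight[0].fill_(1.0)
row_follows = torch.equal(decoder.weight[0], torch.ones(hidden_size))
print(row_follows)  # True
```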

In [7]:
inputs = tokenizer("Selamlar efendim", return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)

In [8]:
outputs.prediction_logits.shape, outputs.seq_relationship_logits.shape

(torch.Size([1, 4, 32000]), torch.Size([1, 2]))

In [9]:
from transformers import pipeline

fill_masker = pipeline(task="fill-mask", model="dbmdz/bert-base-turkish-cased")
fill_masker("Merhaba [MASK] efendi nasıl gidiyor?")

BertForMaskedLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
  - If you are not the owner of the model architecture class, please contact the model code owner to update it.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


[{'score': 0.05243540555238724,
  'token': 3944,
  'token_str': 'Mehmet',
  'sequence': 'Merhaba Mehmet efendi nasıl gidiyor?'},
 {'score': 0.03593067452311516,
  'token': 7709,
  'token_str': 'hanım',
  'sequence': 'Merhaba hanım efendi nasıl gidiyor?'},
 {'score': 0.03291669860482216,
  'token': 3024,
  'token_str': 'bey',
  'sequence': 'Merhaba bey efendi nasıl gidiyor?'},
 {'score': 0.031684163957834244,
  'token': 7983,
  'token_str': 'hoca',
  'sequence': 'Merhaba hoca efendi nasıl gidiyor?'},
 {'score': 0.030081957578659058,
  'token': 10997,
  'token_str': 'Salih',
  'sequence': 'Merhaba Salih efendi nasıl gidiyor?'}]

In [10]:
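%load_ext autoreload
%autoreload 2

# Custom implementation from the local `model` module.
# Note: this import shadows the transformers BertForPreTraining imported in cell 1.
from model import BertForPreTraining, BertConfig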

In [11]:
default_config = BertConfig()
custom_model = BertForPreTraining(default_config)

In [12]:
custom_sd = custom_model.state_dict()

# Both state dicts must have the same number of entries.
assert len(custom_sd) == len(sd_hf)

# Compare entry by entry; this relies on both models registering parameters in the same order.
for (k, v), (k_hf, v_hf) in zip(custom_sd.items(), sd_hf.items()):
    assert k == k_hf, f"{k} : {k_hf}"
    assert v.shape == v_hf.shape, f"{k} : {v.shape} != {v_hf.shape}"
    print(k, v.shape, "--------", k_hf, v_hf.shape)

bert.embeddings.word_embeddings.weight torch.Size([32000, 768]) -------- bert.embeddings.word_embeddings.weight torch.Size([32000, 768])
bert.embeddings.position_embeddings.weight torch.Size([512, 768]) -------- bert.embeddings.position_embeddings.weight torch.Size([512, 768])
bert.embeddings.token_type_embeddings.weight torch.Size([2, 768]) -------- bert.embeddings.token_type_embeddings.weight torch.Size([2, 768])
bert.embeddings.LayerNorm.weight torch.Size([768]) -------- bert.embeddings.LayerNorm.weight torch.Size([768])
bert.embeddings.LayerNorm.bias torch.Size([768]) -------- bert.embeddings.LayerNorm.bias torch.Size([768])
bert.encoder.layer.0.attention.self.query.weight torch.Size([768, 768]) -------- bert.encoder.layer.0.attention.self.query.weight torch.Size([768, 768])
bert.encoder.layer.0.attention.self.query.bias torch.Size([768]) -------- bert.encoder.layer.0.attention.self.query.bias torch.Size([768])
bert.encoder.layer.0.attention.self.key.weight torch.Size([768, 768]) -
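
The comparison in cell 12 depends on both state dicts listing their keys in the same order, and it stops at the first failing assertion. As an alternative sketch, the helper below compares by name and collects every difference before reporting. The function name `diff_state_dict_shapes` and the toy entries are hypothetical; it takes plain `name -> shape` mappings so it runs without the models.

```python
def diff_state_dict_shapes(custom_shapes, reference_shapes):
    """Return a list of differences between two name -> shape mappings (hypothetical helper)."""
    problems = []
    for name in sorted(reference_shapes.keys() - custom_shapes.keys()):
        problems.append(f"missing in custom: {name}")
    for name in sorted(custom_shapes.keys() - reference_shapes.keys()):
        problems.append(f"unexpected in custom: {name}")
    for name in sorted(custom_shapes.keys() & reference_shapes.keys()):
        if custom_shapes[name] != reference_shapes[name]:
            problems.append(
                f"shape mismatch for {name}: {custom_shapes[name]} != {reference_shapes[name]}"
            )
    return problems


# Toy mappings with one deliberate problem of each kind.
reference = {"embeddings.weight": (32000, 768), "pooler.bias": (768,)}
custom = {"embeddings.weight": (30000, 768), "extra.bias": (768,)}

problems = diff_state_dict_shapes(custom, reference)
for problem in problems:
    print(problem)
```

With the real models, the inputs could be built as `{k: tuple(v.shape) for k, v in custom_sd.items()}` and the same for `sd_hf`; an empty list would mean the two layouts agree.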

In [13]:
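# Custom from_pretrained (from the local `model` module); called without arguments,
# it loads dbmdz/bert-base-turkish-cased, as the log line below shows.
model_from_hf = BertForPreTraining.from_pretrained()
mm = model_from_hf.state_dict()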

loading weights from pretrained bert: dbmdz/bert-base-turkish-cased
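
The log line above comes from the custom `from_pretrained`, whose implementation is not shown in this notebook. One common way to do such a transfer, once key names and shapes match as verified in cell 12, is `load_state_dict`. The sketch below uses two small stand-in modules (`make_toy_model` is hypothetical) instead of the real BERT classes so that it runs without downloading anything; it is not necessarily how the local `model` module does it.

```python
import torch
from torch import nn


def make_toy_model():
    # Stand-in architecture; the real models are the two BertForPreTraining classes.
    return nn.Sequential(nn.Linear(4, 8), nn.LayerNorm(8), nn.Linear(8, 2))


torch.manual_seed(0)
source = make_toy_model()  # plays the role of the pretrained HuggingFace model
target = make_toy_model()  # plays the role of the freshly initialized custom model

# strict=True raises an error if any key is missing or unexpected.
result = target.load_state_dict(source.state_dict(), strict=True)
print(result.missing_keys, result.unexpected_keys)  # [] []

# After the copy, both models give identical outputs for the same input.
x = torch.randn(3, 4)
with torch.no_grad():
    same_output = torch.equal(source(x), target(x))
print(same_output)  # True
```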


In [14]:
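from transformers import pipeline

# Reference predictions from the HuggingFace pipeline, for comparison with the custom pipeline below.
fill_masker = pipeline(task="fill-mask", model="dbmdz/bert-base-turkish-cased")
fill_masker("Merhaba [MASK] efendi nasıl gidiyor?")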

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


[{'score': 0.05243540555238724,
  'token': 3944,
  'token_str': 'Mehmet',
  'sequence': 'Merhaba Mehmet efendi nasıl gidiyor?'},
 {'score': 0.03593067452311516,
  'token': 7709,
  'token_str': 'hanım',
  'sequence': 'Merhaba hanım efendi nasıl gidiyor?'},
 {'score': 0.03291669860482216,
  'token': 3024,
  'token_str': 'bey',
  'sequence': 'Merhaba bey efendi nasıl gidiyor?'},
 {'score': 0.031684163957834244,
  'token': 7983,
  'token_str': 'hoca',
  'sequence': 'Merhaba hoca efendi nasıl gidiyor?'},
 {'score': 0.030081957578659058,
  'token': 10997,
  'token_str': 'Salih',
  'sequence': 'Merhaba Salih efendi nasıl gidiyor?'}]

In [15]:
from model import FillMaskPipeline, IsNextPipeline

fill_mask = FillMaskPipeline(model=model_from_hf, tokenizer=tokenizer)

In [16]:
fill_mask(["Merhaba [MASK] efendi nasıl gidiyor?"])

Text: Merhaba [MASK] efendi nasıl gidiyor? ---> Top 5 Predictions: {'token_str': ['Hasan', 'Burak', 'Selim', 'Serdar', 'doktor'], 'score': ['0.007', '0.007', '0.003', '0.009', '0.030']}


In [17]:
fill_mask(["Merhaba [MASK] efendi nasıl gidiyor?", "ışınlanma teknolojisi [MASK] gitmektedir."])

Text: Merhaba [MASK] efendi nasıl gidiyor? ---> Top 5 Predictions: {'token_str': ['hacı', 'hanım', 'Mustafa', 'Mehmet', 'bey'], 'score': ['0.007', '0.051', '0.023', '0.039', '0.036']}
Text: ışınlanma teknolojisi [MASK] gitmektedir. ---> Top 5 Predictions: {'token_str': ['hoşuna', 'önde', 'buradan', 'üzerinden', 'ileriye'], 'score': ['0.021', '0.088', '0.006', '0.047', '0.012']}
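
In the output above the scores are not in descending order, and the predictions for the first sentence differ both from cell 16 and from the HuggingFace pipeline. One possible explanation, stated here as an assumption because the local `model` module is not shown, is that the custom model is still in training mode, so dropout is active during inference. The sketch below uses a toy network (not the real model) to show that `eval()` makes repeated forward passes identical and that `torch.topk` returns scores already sorted.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-in for a masked-LM head: dropout followed by a projection
# onto a 20-token vocabulary. Sizes are made up for illustration.
toy_model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5), nn.Linear(8, 20))
hidden = torch.randn(1, 8)  # pretend hidden state at the [MASK] position

toy_model.eval()  # disables dropout
with torch.no_grad():
    logits_first = toy_model(hidden)
    logits_second = toy_model(hidden)

# With dropout disabled, two passes over the same input agree exactly.
deterministic = torch.equal(logits_first, logits_second)
print(deterministic)  # True

probs = logits_first.softmax(dim=-1).squeeze(0)
top = torch.topk(probs, k=5)  # values come back sorted in descending order
top_scores = top.values.tolist()
top_ids = top.indices.tolist()

for token_id, score in zip(top_ids, top_scores):
    print(token_id, f"{score:.3f}")
```

If this is the cause, calling `model_from_hf.eval()` before building the pipeline would be the corresponding check on the real model.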


In [18]:
fill_mask_custom = FillMaskPipeline(model=BertForPreTraining(BertConfig()), tokenizer=tokenizer)  # randomly initialized, so garbage output is expected

In [19]:
fill_mask_custom(["Merhaba [MASK] efendi nasıl gidiyor?", "ışınlanma teknolojisi [MASK] gitmektedir."]) 

Text: Merhaba [MASK] efendi nasıl gidiyor? ---> Top 5 Predictions: {'token_str': ['Hek', 'örnek', 'edildi', 'çır', '##witch'], 'score': ['0.000', '0.000', '0.000', '0.000', '0.000']}
Text: ışınlanma teknolojisi [MASK] gitmektedir. ---> Top 5 Predictions: {'token_str': ['isteğe', 'İran’ın', 'bölünmüş', 'dedikodu', 'Medres'], 'score': ['0.000', '0.000', '0.000', '0.000', '0.000']}


In [20]:
nsp_pipeline = IsNextPipeline(model=model_from_hf, tokenizer=tokenizer)

In [33]:
nsp_pipeline([["Çocuk sahibi çift sayısında inanılmaz bir artış var.", "Uzaya ilk Bekiroğ Reis çıkmıştır"]])

Text: ['Çocuk sahibi çift sayısında inanılmaz bir artış var.', 'Uzaya ilk Bekiroğ Reis çıkmıştır'] Predictions ----> isNext: 0.072, notNext: 0.928
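
The two numbers printed above sum to 1, which is what a softmax over the two `seq_relationship_logits` would give. As a sketch, assuming the custom pipeline follows the HuggingFace convention where index 0 means the second sentence continues the first and index 1 means it does not, the conversion looks like this. The logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


nsp_logits = [-1.0, 1.5]  # made-up values: [continuation, not a continuation]
is_next, not_next = softmax(nsp_logits)
print(f"isNext: {is_next:.3f}, notNext: {not_next:.3f}")  # isNext: 0.076, notNext: 0.924
```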
