To ensure correct text generation,  
we use **only one tokenizer (`tok_fast`)**, the same one used during training.

This prevents vocabulary mismatch and encoding errors (such as “Ð…”).  
The model is loaded from a checkpoint and its embeddings are resized  
to match the tokenizer’s vocabulary.
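
As a rough illustration of where output like “Ð…” comes from, the sketch below encodes a Tatar string as UTF-8 and then decodes the bytes with the wrong codec. It is not part of the notebook's pipeline: `round_trips` is a hypothetical helper that mirrors the idea of the `RT:` round-trip check, and Latin-1 is only one example of a mismatched codec.

```python
# Illustration only: how UTF-8 Cyrillic turns into "Ð…"-style text
# when its bytes are decoded with the wrong codec.
text = "Күңелле"

raw = text.encode("utf-8")        # each of these letters takes two bytes
garbled = raw.decode("latin-1")   # wrong codec: every byte becomes one character
restored = garbled.encode("latin-1").decode("utf-8")  # undo the mistake


def round_trips(original: str, decoded: str) -> bool:
    """Hypothetical helper: True if the text survived unchanged."""
    return original == decoded


print(repr(garbled[:2]))              # first letter, now split into two characters
print(round_trips(text, restored))    # correct codec pair
print(round_trips(text, garbled))     # mismatched codec pair
```

The same comparison applies to a tokenizer: if decoding the encoded IDs does not reproduce the input, the tokenizer and the text (or the model vocabulary) are out of sync.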


In [None]:
import re

import torch
from transformers import GPT2LMHeadModel, PreTrainedTokenizerFast

# tokenizer
TOK_JSON = '/content/tokenizer_bpe/bpe_16000/tokenizer.json'
tok_fast = PreTrainedTokenizerFast(
    tokenizer_file=TOK_JSON,
    bos_token="<bos>", eos_token="<eos>", unk_token="<unk>", pad_token="<pad>"
)

# load model
mdl = GPT2LMHeadModel.from_pretrained(str(CKPT_DIR)).to(device)
mdl.eval()

if tok_fast.pad_token is None:
    tok_fast.add_special_tokens({"pad_token": "<pad>"})
if tok_fast.eos_token is None:
    tok_fast.add_special_tokens({"eos_token": "<eos>"})

mdl.resize_token_embeddings(len(tok_fast))
mdl.config.pad_token_id = tok_fast.pad_token_id
mdl.config.eos_token_id = tok_fast.eos_token_id

# quick round-trip test: encode and decode a Tatar sentence
probe = "Күңелле һәм бәхетле көн"
ids = tok_fast(probe)["input_ids"]
print("RT:", tok_fast.decode(ids, skip_special_tokens=True))

# token IDs to ban: every vocabulary entry that contains a digit
vocab = tok_fast.get_vocab()
digit_ids = [[tid] for tok, tid in vocab.items() if any(ch.isdigit() for ch in tok)]

# generate func
def generate(prompt: str, max_new_tokens=80):
    enc = tok_fast(prompt, return_tensors="pt")
    with torch.no_grad():
        out = mdl.generate(
            input_ids=enc.input_ids.to(device),
            attention_mask=enc.attention_mask.to(device),
            max_new_tokens=max_new_tokens,
            num_beams=4,
            early_stopping=True,
            no_repeat_ngram_size=3,
            repetition_penalty=1.2,
            length_penalty=0.9,
            bad_words_ids=digit_ids or None,
            pad_token_id=tok_fast.pad_token_id,
            eos_token_id=tok_fast.eos_token_id,
        )
    text = tok_fast.decode(out[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
    text = re.sub(r"\d+", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text



RT: Күңелле һәм бәхетле көн


The model was tested on several Tatar-language questions.
It produced coherent, on-topic text, although it continued the prompt in a news-like style rather than answering the questions directly. This shows that it learned to "speak" Tatar after being trained from scratch.

# QA Demonstration

In [None]:
p = [
    "Рәүф хәзрәт Гайнетдин нәрсә эшли?",  # What does Rauf Khazrat Gainetdin do?
]

print("Q:", p)
print("A:", generate(p, max_new_tokens=60))
print("------")


Q: ['Рәүф хәзрәт Гайнетдин нәрсә эшли?']
A: Рәүф хәзрәт Гайнетдин нәрсә эшли?“Ачык ишекләр көне” кысаларында “Иман” балалар-спорт комплексында, шулай ук “Зөя Казань” халыкара аэропорты хезмәткәрләре өчен дә оештырылган иде.
------


Translation:

Q: 'What does Rauf Khazrat Gainetdin do?'

A: What does Rauf Khazrat Gainetdin do? It was organized as part of the 'Open Doors Day' at the 'Iman' children's sports complex, as well as for the staff of the 'Zoya Kazan' international airport.

In [None]:
p = [
    "ТР Дәүләт Советы Рәисе турында кыскача яз.",  # Write briefly about the President of the State Council of the Republic of Tatarstan.
]
print("Q:", p)
print("A:", generate(p, max_new_tokens=60))
print("------")


Q: ['ТР Дәүләт Советы Рәисе турында кыскача яз.']
A: ТР Дәүләт Советы Рәисе турында кыскача яз.Матбугат конференциясендә ТР Премьер-министры Рөстәм Миңнеханов, ТР Президенты Минтимер Шәймиев һәм ТР Дәүләт Советы депутаты Марат Сафиуллин катнашты.
------


Translation:

Q: 'Write briefly about the President of the State Council of the Republic of Tatarstan.'

A: Write briefly about the President of the State Council of the Republic of Tatarstan. The press conference was attended by the Prime Minister of the Republic of Tatarstan Rustam Minnikhanov, the President of the Republic of Tatarstan Mintimer Shaimiev, and the Deputy of the State Council of the Republic of Tatarstan Marat Safiullin.

# Mini-benchmark

I also implemented a mini-benchmark with four different formulations of the question “Who is the mayor of Kazan?”. The model gives the same answer regardless of how the question is phrased.

In [None]:
prompts = [
    "Казан шәһәре мэры кем?",  # Who is the mayor of Kazan?
    "Казан шәһәре мэры турында кыскача яз.",  # Write briefly about the mayor of Kazan.
    "Казан шәһәре мэры — ул кем?",  # The mayor of Kazan: who is he?
    "Казан шәһәре мэры турында өч җөмлә яз.",  # Write three sentences about the mayor of Kazan.
]

for p in prompts:
    print("Q:", p)
    print("A:", generate(p, max_new_tokens=30))
    print("------")


Q: Казан шәһәре мэры кем?
A: Казан шәһәре мэры кем?Бу хакта “Татар-информ” МА хәбәрчесенә Казан шәһәре мэры Илсур Метшин хәбәр итә.
------
Q: Казан шәһәре мэры турында кыскача яз.
A: Казан шәһәре мэры турында кыскача яз.Бу хакта “Татар-информ” МА хәбәрчесенә Казан шәһәре мэры Илсур Метшин хәбәр итә.
------
Q: Казан шәһәре мэры — ул кем?
A: Казан шәһәре мэры — ул кем?Бу хакта “Татар-информ” МА хәбәрчесенә Казан шәһәре мэры Илсур Метшин хәбәр итә.
------
Q: Казан шәһәре мэры турында өч җөмлә яз.
A: Казан шәһәре мэры турында өч җөмлә яз.Бу хакта “Татар-информ” М

I also implemented two automatic evaluation metrics: notEcho and kwCover.

| Metric      | Meaning                                                      |
| ----------- | ------------------------------------------------------------ |
| **notEcho** | Proportion of answers that do not simply repeat the question |
| **kwCover** | Coverage of key informational words (e.g., “Илсур Метшин”)   |


In [None]:
# Kazan Mayor Mini-Benchmark
import re, torch, numpy as np

torch.manual_seed(42)




def not_echo(q, a):
    return q.strip().lower() not in a.strip().lower()

def has_keywords(a, kws):
    a_low = a.lower()
    return sum(1 for k in kws if k in a_low) / max(1, len(kws))

facts_kws = ["илсур", "метшин", "казан", "мэр"]

# One recorded answer per prompt. The commas matter: without them Python
# concatenates adjacent string literals into a single list element.
answers = [
    "Казан шәһәре мэры кем?Бу хакта “Татар-информ” МА хәбәрчесенә Казан шәһәре мэры Илсур Метшин хәбәр итә.",
    "Казан шәһәре мэры турында кыскача яз.Бу хакта “Татар-информ” МА хәбәрчесенә Казан шәһәре мэры Илсур Метшин хәбәр итә.",
    "Казан шәһәре мэры — ул кем?Бу хакта “Татар-информ” МА хәбәрчесенә Казан шәһәре мэры Илсур Метшин хәбәр итә.",
    "Казан шәһәре мэры турында өч җөмлә яз.Бу хакта “Татар-информ” МА хәбәрчесенә Казан шәһәре мэры Илсур Метшин хәбәр итә.",
]
scores = {
    "notEcho": np.mean([not_echo(q, a) for q, a in zip(prompts, answers)]),
    "kwCover": np.mean([has_keywords(a, facts_kws) for a in answers]),
}

print("\n=== 📊 Summary (mini-benchmark) ===")
print({k: round(v, 2) for k, v in scores.items()})



=== 📊 Summary (mini-benchmark) ===
{'notEcho': 0.0, 'kwCover': 1.0}


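The model echoes the question at the start of every answer (notEcho = 0). This is largely expected, because `generate` decodes the full output sequence, which begins with the prompt tokens.
The answers still reach 100% coverage of the factual keywords, although two of the four keywords ("казан", "мэр") already appear in the prompts themselves.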
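
One way to separate the echo from the continuation is to strip the prompt prefix before scoring. The sketch below is an assumption about how that could look, not the notebook's method: `strip_prompt` is a hypothetical helper, `not_echo` and `has_keywords` repeat the definitions from the benchmark cell, and the strings are the first benchmark prompt and its recorded answer.

```python
def strip_prompt(prompt: str, answer: str) -> str:
    """Hypothetical helper: drop a leading copy of the prompt from the answer."""
    p, a = prompt.strip(), answer.strip()
    return a[len(p):].strip() if a.startswith(p) else a


def not_echo(q: str, a: str) -> bool:
    # Same check as in the benchmark cell.
    return q.strip().lower() not in a.strip().lower()


def has_keywords(a: str, kws: list[str]) -> float:
    # Same check as in the benchmark cell.
    a_low = a.lower()
    return sum(1 for k in kws if k in a_low) / max(1, len(kws))


prompt = "Казан шәһәре мэры кем?"
answer = prompt + "Бу хакта “Татар-информ” МА хәбәрчесенә Казан шәһәре мэры Илсур Метшин хәбәр итә."
kws = ["илсур", "метшин", "казан", "мэр"]

continuation = strip_prompt(prompt, answer)
print("raw answer:   notEcho =", not_echo(prompt, answer))
print("continuation: notEcho =", not_echo(prompt, continuation))
print("continuation: kwCover =", has_keywords(continuation, kws))
```

With the tensors returned by `mdl.generate`, a comparable effect could probably be had by decoding only the tokens after the prompt length (for example `out[0][enc.input_ids.shape[1]:]`), but that variant is untested here.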

In [None]:

def eval_tok_path(tok_path):
    tok_tmp = PreTrainedTokenizerFast(tokenizer_file=tok_path, bos_token="<bos>", eos_token="<eos>", unk_token="<unk>", pad_token="<pad>")
    sample = sample_lines(CLEAN_FILE, 1000)
    lens=[]
    for s in sample:
        enc = tok_tmp(s, return_tensors=None)
        lens.append(len(enc["input_ids"]))
    return np.mean(lens)

rows=[]
for vs, p in TOK_RUNS.items():
    rows.append((vs, round(eval_tok_path(p),2)))
rows = sorted(rows)
print("Vocab | avg_tokens_per_sentence")
for r in rows: print(r)


Vocab | avg_tokens_per_sentence
(8000, 27.37)
(16000, 24.3)
(32000, 22.22)
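
To make the trend in these numbers easier to read, the short sketch below converts them into relative reductions against the 8k vocabulary. The figures are copied from the output above; nothing new is measured.

```python
# Average tokens per sentence, copied from the output above.
avg_tokens = {8000: 27.37, 16000: 24.3, 32000: 22.22}

baseline = avg_tokens[8000]
reductions = {
    vocab_size: (baseline - avg) / baseline * 100
    for vocab_size, avg in avg_tokens.items()
}

for vocab_size in sorted(avg_tokens):
    print(
        f"{vocab_size:>6}: {avg_tokens[vocab_size]:5.2f} tokens/sentence, "
        f"{reductions[vocab_size]:4.1f}% fewer than the 8k vocabulary"
    )
```

Sequences get roughly 11% shorter at 16k and roughly 19% shorter at 32k, so the second doubling of the vocabulary buys less than the first.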


# Ablation Experiment (max_new_tokens)

I tested max_new_tokens = 40 / 80 / 160.
Short outputs (40) were more focused, while longer ones (80–160) resembled news-style paragraphs, often repeating the prompt or drifting off-topic.

All of these variants produced worse factual and contextual accuracy than the default parameters.

In [None]:
# Ablation experiment: vary max_new_tokens
for mnt in [40, 80, 160]:
    print(f"\n=== max_new_tokens={mnt} ===")
    print("Q: Казан шәһәре мэры турында кыскача яз.")
    print("A:", generate("Казан шәһәре мэры турында кыскача яз.", max_new_tokens=mnt))



=== max_new_tokens=40 ===
Q: Казан шәһәре мэры турында кыскача яз.
A: Казан шәһәре мэры турында кыскача яз.Аның сүзләренә караганда, Татарстан Республикасында “ Сәламәтлек ” илкүләм проекты кысаларында республиканың барлык предприятиеләре һәм авыл хуҗалыгы продукциясе җитештерү буенча федераль үзәкләштерелгән.

=== max_new_tokens=80 ===
Q: Казан шәһәре мэры турында кыскача яз.
A: Казан шәһәре мэры турында кыскача яз.Чараны ТР Президенты Минтимер Шәймиев, “ Россия гражданнарына – яңа технологияләр ” проектын гамәлгә ашыру кысаларында Казан муниципаль берәмлег

Translation:

=== max_new_tokens=40 ===

Q: Write briefly about the Mayor of Kazan.

A: Write briefly about the Mayor of Kazan. According to him, under the national project "Health" in the Republic of Tatarstan, all enterprises and agricultural production in the republic are centrally managed by the federal government.

=== max_new_tokens=80 ===

Q: Write briefly about the Mayor of Kazan.

A: Write briefly about the Mayor of Kazan. The event was presented by Tatarstan President Mintimer Shaimiev and, within the framework of the project "New Technologies for Russian Citizens," by the Deputy Chairman of the Executive Committee of the Kazan Municipality, Ravil Zaripov, and Tatarstan President Rustam Minnikhanov.

=== max_new_tokens=160 ===

Q: Write briefly about the Mayor of Kazan.

A: Write briefly about the Mayor of Kazan. Yesterday, together with the "Tatar-inform" news agency, he visited "Ak Bars" to come to "Bala".


**Conclusion:**

The model can sustain long text generation, indicating good contextual continuity,
but the default configuration still yields the most accurate and relevant results.

# Decoding Comparison (Beam Search vs Top-p Sampling)

| Strategy           | Expected Characteristics            | Actual Observation (Based on Outputs) |
| ------------------ | ----------------------------------- | ------------------------------------- |
| **Beam Search**    | Deterministic, syntactically stable | Tends to **repeat the prompt** and output generic, incomplete sentences without key facts (e.g., *“Казан шәһәре мэры турында кыскача яз. ... процентка күбрәк.”*) |
| **Top-p Sampling** | More diverse and creative           | Produces **news-like, verbose sentences** containing irrelevant entities and drifting away from the topic (e.g., *“... Ульяновски ... иҗтимагый оешмалар ...”*) |
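
The "Expected Characteristics" column can be illustrated with a toy next-token distribution. The sketch below is a simplified version of the nucleus idea, not the library's implementation: `top_p_filter` is a hypothetical helper, and the probabilities are made up rather than taken from the model.

```python
import random


def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the most likely tokens until their mass reaches top_p, then renormalise."""
    kept, total = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        total += p
        if total >= top_p:
            break
    return {tok: p / total for tok, p in kept.items()}


# Toy next-token distribution (made-up numbers, not from the model).
probs = {"мэры": 0.5, "шәһәре": 0.3, "Илсур": 0.15, "процентка": 0.05}

# Deterministic choice: the single most likely token.
# (Beam search is also deterministic, but tracks several hypotheses.)
most_likely = max(probs, key=probs.get)

# Nucleus sampling: draw from the truncated, renormalised distribution.
nucleus = top_p_filter(probs, top_p=0.9)
rng = random.Random(42)
sampled = rng.choices(list(nucleus), weights=list(nucleus.values()), k=1)[0]

print("most likely:", most_likely)
print("nucleus:", sorted(nucleus))
print("sampled:", sampled)
```

Because lower-ranked tokens stay in the nucleus, sampling can pick a plausible but off-topic continuation, which matches the drift observed in the top-p output.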


In [None]:
import re, torch

def generate_beam(prompt, max_new_tokens=60):
    """Generate text with beam search."""
    enc = tok_fast(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        out = mdl.generate(
            **enc,
            max_new_tokens=max_new_tokens,
            num_beams=4,           # Beam Search
            early_stopping=True,
            no_repeat_ngram_size=3,
            repetition_penalty=1.2,
            pad_token_id=tok_fast.pad_token_id,
            eos_token_id=tok_fast.eos_token_id,
        )
    text = tok_fast.decode(out[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
    text = re.sub(r"\d+", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text


def generate_top_p(prompt, max_new_tokens=60, temperature=0.9, top_p=0.9):
    """Generate text with top-p (nucleus) sampling."""
    enc = tok_fast(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        out = mdl.generate(
            **enc,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=temperature,
            top_p=top_p,
            top_k=50,
            no_repeat_ngram_size=3,
            pad_token_id=tok_fast.pad_token_id,
            eos_token_id=tok_fast.eos_token_id,
        )
    text = tok_fast.decode(out[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
    text = re.sub(r"\d+", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text


# compare both strategies on the same prompt
prompt = "Казан шәһәре мэры турында кыскача яз."

beam_ans = generate_beam(prompt)
topp_ans = generate_top_p(prompt)

print("\n[Beam Search]\n", beam_ans)
print("\n[Top-p Sampling]\n", topp_ans)



[Beam Search]
 Казан шәһәре мэры турында кыскача яз.Аның сүзләренә караганда, узган елның шул чоры белән чагыштырганда, , процентка күбрәк.

[Top-p Sampling]
 Казан шәһәре мэры турында кыскача яз.Ульяновски өлкәсенең “Хәрәкәт” җәмгыяте җитәкчесе һәм Казан шәһәре башкарма комитетының яшьләр-районнарыннан, иҗтимагый оешмаларның җитәкчеләре, республиканың башка муниципаль берәмлекләр башлыклары да булды.


Translation:

[Beam Search] Write a brief note about the Mayor of Kazan. According to his words, compared to the same period last year, it increased by a certain percent.

[Top-p Sampling] Write a brief note about the Mayor of Kazan. The head of the "Movement" society in Ulyanovsk region, as well as representatives from youth districts of the Kazan city executive committee, leaders of public organizations, and heads of other municipal entities of the republic, were also present.

**Conclusion:**

Both decoding strategies performed worse than the default setup:

- Beam Search gave short but uninformative answers, repeating the question.
- Top-p Sampling generated longer yet off-topic text.

Thus, keeping the default decoding parameters remains the most reliable option.