## **Tasks-Code**

In [1]:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

In [5]:
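# General helper for generating output
def generate_output(model, tokenizer, input_text, prefix="", max_length=100, num_beams=4):
  # Tokenize the prefixed input, truncating it to at most 512 tokens
  inputs = tokenizer(prefix + input_text, return_tensors="pt", max_length=512, truncation=True)
  # Generate with beam search; max_length caps the output length in tokens
  outputs = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
  # Decode the best sequence back to plain text
  return tokenizer.decode(outputs[0], skip_special_tokens=True)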
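
The `num_beams` argument of the helper sets the beam width used during generation. The following is a toy sketch of the idea only, not the library's implementation: `NEXT_TOKEN_PROBS` is a hypothetical hand-written probability table standing in for a model, `beam_search` is an illustrative name, and details such as length penalties are left out.

```python
import math

# Hypothetical toy "model": next-token probabilities given the last token.
# The numbers are made up purely for illustration.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.55, "dog": 0.45},
    "a": {"cat": 0.9, "dog": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def beam_search(num_beams, max_length):
    """Return the highest-scoring token sequence found with the given beam width."""
    beams = [(0.0, ["<s>"])]  # each beam is (log-probability, tokens so far)
    for _ in range(max_length):
        candidates = []
        for score, tokens in beams:
            if tokens[-1] == "</s>":
                # Finished sequences are carried over unchanged.
                candidates.append((score, tokens))
                continue
            for token, prob in NEXT_TOKEN_PROBS[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Keep only the num_beams best partial sequences.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:num_beams]
    return beams[0][1]


print(beam_search(num_beams=1, max_length=5))  # greedy: always follows the single best token
print(beam_search(num_beams=2, max_length=5))  # wider beam: can recover a better full sequence
```

In this toy table the greedy search commits to "the" because it is the most likely first token, while a beam of width 2 keeps "a" alive long enough to find the more probable full sequence.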

#### **1. Machine Translation**

In [None]:
model_name = "Helsinki-NLP/opus-mt-en-de" # English to German
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)



In [None]:
input_text = "Hello, how are you?"
print("translate : ", generate_output(model,tokenizer,input_text))

translate :  Hallo, wie geht's?


#### **2. Summarization**

In [None]:
model_name = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [None]:
input_text = """Artificial Intelligence (AI) is a branch of computer science focused on creating systems capable of performing tasks that normally require human intelligence.
Using techniques like machine learning, these systems can learn from data, recognize patterns, and make decisions.
AI is now transforming major industries, from healthcare to finance, by automating processes and generating insights,
 making it one of the most significant technological forces of our time."""
print("summarize : ", generate_output(model,tokenizer,input_text,max_length=20))



summarize :  Artificial Intelligence (AI) is a branch of computer science focused on creating systems capable


#### **3. Question Answering**

In [None]:
model_name = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [None]:
input_text = "context: The capital of France is Paris. question: What is the capital of France?"
print("response : ", generate_output(model,tokenizer,input_text,prefix="qa: "))

response :  Paris


#### **4. Text Generation**

In [None]:
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [None]:
input_text = "generate a story about a cat"
print("generated text: ", generate_output(model,tokenizer,input_text,max_length=15,prefix="story: "))

generated text:  a cat is a cat that lives in a house.


#### **5. Text Paraphrasing**

In [None]:
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [None]:
input_text = "The weather is extremely cold today."
print( generate_output(model,tokenizer,input_text,prefix="paraphrase: "))

The weather is extremely cold today.


#### **6. Grammar Correction**

In [None]:
model_name = "google/t5-v1_1-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [None]:
input_text = "I am going to market for buy some fruits."
print( generate_output(model,tokenizer,input_text,prefix="grammar: "))

. I am going to market to buy some fruits. .


#### **7. Information Extraction**

In [7]:
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [8]:
input_text = "John lives in New York and works at Google."
print( generate_output(model,tokenizer,input_text,prefix="extract: "))

John is a computer programmer.


#### **8. Code Generation**

In [27]:
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [28]:
input_text = "Write a Python function to calculate the factorial of a number."
print( generate_output(model,tokenizer,input_text,max_length= 500, prefix="Generate Python code for the following task: "))

n = int(input()) for i in range(1, n + 1): if n % i == 0: n = n // i else: n = n // i print(n)


#### **9. Text-to-Text Tasks**

In [35]:
model_name = "Helsinki-NLP/opus-mt-en-fr"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [36]:
input_text = "How are you?"
print( generate_output(model,tokenizer,input_text))

Comment allez-vous ?


#### **10. Text Infilling** (filling masked spans)

In [40]:
model_name = "facebook/bart-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [43]:
input_text = "I am going to <mask> to buy some fruits."
print( generate_output(model,tokenizer,input_text,max_length=50))

I am going to the grocery store to buy some fruits.


---
## **Seq2SeqLM-Models**

| Model (Encoder–Decoder) | Base architecture | Main text-generation tasks                                                                         | Example Hugging Face models                                |
| ----------------------- | ----------------- | -------------------------------------------------------------------------------------------------- | ---------------------------------------------------------- |
| **T5 / Flan-T5**        | Encoder–Decoder   | General purpose: translation, summarization, generative QA, paraphrasing, data-to-text, general generation | `t5-small`, `google/flan-t5-base`                          |
| **BART**                | Encoder–Decoder   | Summarization, abstractive QA, dialogue, paraphrasing                                              | `facebook/bart-large-cnn`, `facebook/bart-base`            |
| **Pegasus**             | Encoder–Decoder   | Summarization (long documents, news, scientific papers)                                            | `google/pegasus-xsum`, `google/pegasus-cnn_dailymail`      |
| **mBART**               | Encoder–Decoder   | Multilingual summarization, multilingual translation, multilingual generative QA                   | `facebook/mbart-large-50`                                  |
| **MarianMT**            | Encoder–Decoder   | Translation (EN ↔ other languages)                                                                 | `Helsinki-NLP/opus-mt-en-fr`, `Helsinki-NLP/opus-mt-en-de` |
| **ProphetNet**          | Encoder–Decoder   | Summarization, paraphrasing, general conditional generation                                        | `microsoft/prophetnet-large-uncased`                       |
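
The task cells in this notebook repeat the same load-and-generate pattern and vary only the checkpoint, the prompt prefix, and `max_length`. One possible way to collect those settings in a single place is sketched below. The names `TASKS`, `build_prompt`, and `run_task` are hypothetical; the checkpoints, prefixes, and lengths are copied from the cells above, and `run_task` assumes `transformers` and a PyTorch backend are installed.

```python
# Per-task settings copied from the notebook cells: (checkpoint, prefix, max_length).
TASKS = {
    "translate_en_de": ("Helsinki-NLP/opus-mt-en-de", "", 100),
    "translate_en_fr": ("Helsinki-NLP/opus-mt-en-fr", "", 100),
    "summarize": ("facebook/bart-large-cnn", "", 20),
    "qa": ("t5-small", "qa: ", 100),
    "story": ("google/flan-t5-base", "story: ", 15),
    "paraphrase": ("google/flan-t5-base", "paraphrase: ", 100),
    "grammar": ("google/t5-v1_1-base", "grammar: ", 100),
    "extract": ("google/flan-t5-base", "extract: ", 100),
    "code": ("google/flan-t5-base", "Generate Python code for the following task: ", 500),
    "infill": ("facebook/bart-base", "", 50),
}


def build_prompt(task, text):
    """Prepend the task's prefix (possibly empty) to the input text."""
    _, prefix, _ = TASKS[task]
    return prefix + text


def run_task(task, text, num_beams=4):
    """Load the task's checkpoint and generate one output string."""
    # Imported lazily so the registry can be inspected without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    checkpoint, _, max_length = TASKS[task]
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    inputs = tokenizer(build_prompt(task, text), return_tensors="pt",
                       max_length=512, truncation=True)
    outputs = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

A call such as `run_task("translate_en_fr", "How are you?")` would then mirror the corresponding cell, downloading the checkpoint on first use.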
