# Imports And Installations

In [1]:
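import sys
import subprocess

def install(package):
    # Run pip through the current interpreter so the package is installed
    # into the same environment this notebook kernel is using.
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])

install("transformers")
install("torch")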
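
The cell above calls pip on every run, even when the packages are already present. A minimal sketch of skipping the install when a module is already importable, using only the standard library. `is_installed` and `ensure_installed` are hypothetical helper names, and the sketch assumes the pip package name matches the import name, which holds for `transformers` and `torch` but not for every package.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if the import system can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


def ensure_installed(package):
    """Install `package` with pip only when it is not already importable.

    Assumption: the pip package name equals the import name.
    Returns True if pip was invoked, False if nothing had to be done.
    """
    if is_installed(package):
        return False
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])
    return True
```

With these helpers, `ensure_installed("transformers")` would only reach pip on the first run in a fresh environment.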

In [2]:
from transformers import pipeline, set_seed


In [3]:
from transformers import AutoTokenizer, AutoModelForCausalLM

# Basic English Text Generation (GPT-2)

In [4]:
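# Build a text-generation pipeline backed by the English GPT-2 checkpoint.
generator_en = pipeline("text-generation", model="gpt2")
set_seed(42)  # fix the random seed so sampled outputs can be repeated

# Note: the log below reports `max_new_tokens` (=256) alongside `max_length`,
# and the returned text runs well past 30 tokens, so max_length=30 does not
# shorten the output here.
generator_en(
    "Hello, I'm a language model,",
    max_length=30,
    num_return_sequences=5
)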

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_le

[{'generated_text': "Hello, I'm a language model, I'm a language model. In my mind, I'm doing the same thing as you. All these different people are thinking about the same thing.\n\nI'm not talking about what the computer program does. I'm talking about what I'm thinking about. I'm thinking about what my body is thinking about. I'm thinking about what I'm making it. I'm thinking about what I'm making it. What my body is doing. I'm thinking about what my body is thinking about. What's my body doing? What's my body doing? What's my body doing?\n\nLet's talk about this, I'm not a computer programmer. I'm not a software developer. I'm not a computer programmer. I'm not a language model. I'm not a language model. My body is thinking about what I'm doing. What is my body doing? What's my body doing? What's my body doing?\n\nNow, I'm not saying that this is a good thing. But if you want to get better at programming, you can get better at writing. You can get better at being human. You can get
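
The output above runs far past 30 tokens even though `max_length=30` was passed. One way to cap the length is to pass `max_new_tokens` explicitly instead. The following is a sketch, not the notebook author's code: `generate_short` is a hypothetical helper that accepts any pipeline-like callable, while `max_new_tokens`, `num_return_sequences`, and `do_sample` are generation arguments that the text-generation pipeline forwards to the model.

```python
def generate_short(generator, prompt, max_new_tokens=30, num_return_sequences=5):
    """Call a text-generation pipeline with an explicit cap on new tokens.

    `generator` is assumed to be a callable such as `generator_en` above.
    """
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,  # counts only tokens added after the prompt
        num_return_sequences=num_return_sequences,
        do_sample=True,  # sample, so the returned sequences can differ
    )
    # Each result is a dict; keep just the text.
    return [item["generated_text"] for item in outputs]
```

In the notebook this would be called as `generate_short(generator_en, "Hello, I'm a language model,")`.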

# Question-style Prompt Generation

In [5]:
generator_en(
    "Can I ask you a question?",
    max_length=30,
    num_return_sequences=5
)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'Can I ask you a question?\n\nDo you like to read about the work of the greats on both sides?\n\nDo you know what a wonderful job this website is to you?\n\nDo you have a problem with your browser?\n\nWhat is your favorite book about a great author?\n\nWhat is your favorite movie?\n\nWhat is your favorite book about a great storyteller?\n\nWhat is your favorite book about a great novelist?\n\nDo you have any questions?\n\nTell us what you think!\n\nWhat do you want to see in a book?\n\nWhat kind of books do you want to read?\n\nWhat does your favorite book about a great author do?\n\nWhat do you think about the work of a great writer?\n\nWhat do you think about the work of a great novelist?\n\nWhat is your favorite book about a great author?\n\nWhat does your favorite book about a great novelist do?\n\nWhat do you think about the work of a great writer?\n\nWhat do you think about the work of a great writer?\n\nWhat are your favorite books about a great author?\n\nWh
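
Each result is a dict whose `generated_text` begins with the prompt itself. A small sketch for keeping only the continuation: `continuations` is a hypothetical helper name, and it works on the plain list-of-dicts shape shown above. The text-generation pipeline also accepts `return_full_text=False`, which should give a similar result without post-processing.

```python
def continuations(outputs, prompt):
    """Return only the newly generated text from pipeline results."""
    results = []
    for item in outputs:
        text = item["generated_text"]
        # Drop the echoed prompt when it is present at the start.
        if text.startswith(prompt):
            text = text[len(prompt):]
        results.append(text.strip())
    return results
```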

# Factual Prompt – Short Output

In [6]:
generator_en(
    "Albert Einstein is one of the most",
    max_length=30,
    num_return_sequences=5
)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'Albert Einstein is one of the most famous mathematicians of our age. He used the term "computational" just once. It\'s a pretty amazing idea. What does that mean? Well, this is a great question and the answer is that it means that quantum mechanics is a theory of how information moves around space. So we don\'t know when information is coming in or out of, we don\'t know what it is. And so quantum mechanics is a theory of how information moves. It\'s a theory of how particles move around. It\'s a theory of the universe. And so it\'s a theory of what information moves around. It\'s a theory of how information moves around space. And so of course, we\'ll see in the next part of this article that we will see that if we apply it to the universe, we find that there is a very strong and strong evidence that we are in the same place as the universe.\n\nThis article is about the idea that information moves around the entire universe and that is the core of the theory of qu
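
The warning above states that `max_new_tokens` takes precedence when both limits are set. The two limits also count different things: `max_length` includes the prompt tokens, while `max_new_tokens` counts only the tokens added after the prompt. The arithmetic can be sketched as follows. `new_token_budget` is a hypothetical helper, and the prompt length of 8 tokens is an illustrative assumption, not a measured value.

```python
def new_token_budget(prompt_tokens, max_length=None, max_new_tokens=None):
    """Number of tokens generation may add after the prompt.

    Mirrors the precedence stated in the warning: `max_new_tokens` wins
    when both limits are set. `max_length` includes the prompt itself.
    """
    if max_new_tokens is not None:
        return max_new_tokens
    if max_length is not None:
        return max(max_length - prompt_tokens, 0)
    raise ValueError("set max_length or max_new_tokens")


# Illustrative prompt length of 8 tokens (an assumption, not measured).
print(new_token_budget(8, max_length=30, max_new_tokens=256))  # 256
print(new_token_budget(8, max_length=30))                      # 22
```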

# Long-form Text Generation

In [7]:
generator_en(
    "Albert Einstein is one of the most",
    max_length=200,
    num_return_sequences=3
)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=200) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'Albert Einstein is one of the most influential and influential figures in modern philosophy.\n\nHe was born in Paris at the age of 18 and educated at a very young age in the school of the very famous physicist, Albert Einstein. He was educated at a very young age in the University of Cambridge and was a good student. He had a very good eye for mathematics, he was able to see that if an object were to be put into space it was going to be put into solid form by means of an air conditioner.\n\nHe had a very good sense of geometry, which he had studied at Cambridge, which was the University of Cambridge. He was also an excellent mathematician and had an excellent sense of mathematics. He was also a good friend of the philosopher Max Weber and a good fellow. He had a lot of ideas in his mind, and he had done experiments on his own.\n\nHe was also a good friend of the philosopher H.G. Wells. He was a great scientist and a great friend of the philosopher Albert N. Feynman
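
Several of the outputs above loop over the same phrases. One rough way to quantify that is the share of distinct word n-grams in a text. This is a sketch under simplifying assumptions: `distinct_ngram_ratio` is a hypothetical helper and it splits on whitespace rather than using the model's tokenizer. If the score is low, generation arguments such as `no_repeat_ngram_size` or `repetition_penalty` may be worth trying.

```python
def distinct_ngram_ratio(text, n=3):
    """Share of word n-grams that are unique (1.0 means no repeated n-gram)."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 1.0  # text shorter than n words: nothing can repeat
    return len(set(ngrams)) / len(ngrams)


print(distinct_ngram_ratio("a b c d"))      # 1.0
print(distinct_ngram_ratio("a b c a b c"))  # 0.75
```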

# Arabic Examples

# Arabic Prompt with English GPT-2

In [8]:
generator_en(
    # Arabic prompt: "I want to tell you about an important topic"
    "أريد أن أحكي لك عن موضوع هام",
    max_length=100,
    num_return_sequences=3
)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=100) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'أريد أن أحكي لك عن موضوع هام الطلبية وجها فقبلا وحجها فقطر إلى الله شه، وسنيا أله إليه سنيا أن أن أحكي لك عن موضوع هام الطلبية وجها فقبلا وحجها فقطر إلى الله شه، وسنيا أله إليه سنيا أن أحكي لك عن موضوع هام الطلبية وجها فقبلا وحجها فقطر إلى الله شه، وسنيا أله إليه سنيا أن �'},
 {'generated_text': 'أريد أن أحكي لك عن موضوع هامول تطبعل تطبير بغعتائن من تأريد (الشعل أن أن أيع على الأحرية).\n\n1. The Quran says:\n\nThe Prophet (peace and blessings of Allaah be upon him) said: "If you are to believe in Allaah and the Messenger (peace and blessings of Allaah be upon him) and to follow the Messenger (peace and blessings of Allaah be upon him) and to follow the Messenger (peace and blessings of Allaah be upon him), then they should not use their hands (to go to a mosque) to enter it. I think that (Muslims) would not believe in such a thing (if they were) and they would not go to the mosque unless they made a good effort.\n\n2. The Qur\'an says:\n\nIbn Hajar said: "If Muslim
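
GPT-2's tokenizer is a byte-level BPE, so it operates on UTF-8 bytes, and each letter in the basic Arabic block takes two bytes. The trailing `�` in the first result above is consistent with generation stopping partway through a multi-byte character, although the output alone does not prove that. A standard-library sketch of both effects follows; `utf8_bytes_per_char` is a hypothetical helper name.

```python
def utf8_bytes_per_char(text):
    """Average number of UTF-8 bytes per character in a non-empty `text`."""
    return len(text.encode("utf-8")) / len(text)


arabic = "موضوع"
print(utf8_bytes_per_char("topic"))  # 1.0
print(utf8_bytes_per_char(arabic))   # 2.0

# Cutting a two-byte character in half leaves an undecodable byte,
# which Python renders as the replacement character U+FFFD.
half = arabic.encode("utf-8")[:-1]
print(half.decode("utf-8", errors="replace")[-1] == "\ufffd")  # True
```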

# Arabic Text Generation using Arabic GPT-2 (Pipeline)

In [9]:
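# Same task, but with a GPT-2 checkpoint trained on Arabic text.
generator_ar = pipeline(
    "text-generation",
    model="akhooli/gpt2-small-arabic"
)

generator_ar(
    # Arabic prompt: "I want to tell you about an important topic"
    "أريد أن أحكي لك عن موضوع هام",
    max_length=120,
    num_return_sequences=3
)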

Device set to use cpu
Both `max_new_tokens` (=256) and `max_length`(=120) seem to have been set. `max_new_tokens` will take precedence

[{'generated_text': 'أريد أن أحكي لك عن موضوع هام جدا من حيث النوع الاجتماعي، لذلك فهو يرى أن الدين هو سبب عدم التوافق بين الديانات. في المقابل، يرى أن الدين هو سبب عدم التوافق بين الأديان. كما أن الدين هو سبب عدم التوافق بين الأديان. كما أنه يرى أن الدين هو سبب عدم التوافق بين الأديان. أما في علم الأديان، فمصطلح الدين هو الأساس للخلاف بين الأديان. في حين أن الدين هو السبب الرئيسي للخلاف بين الأديان، فهو يرى أن الدين هو سبب عدم التوافق بين الأديان. في حين أن الدين هو مصدر الخلاف بين الأديان، فإنه يرى أن الدين هو سبب عدم التوافق بين الأديان. فمثلا، هناك بعض الدول التي لديها موقف قوي من الدين. في حين أن الدول التي لديها موقف قوي من الدين، والتي لديها موقف قوي من الدين، فإن الدول التي لديها موقف قوي من الدين، مثل الصين والهند، لديها موقف قوي من الدين. في حين أن الدول التي لديها موقف قوي من الدين، فإنها لا تملك موقف قوي من الدين. في حين أنه في حين أن الدول التي لديها موقف قوي من الدين، فإن الدول التي لديها موقف قوي من الدين، مثل تايلاند والهند، لديها موقف قوي من الدين. هذا يؤكد أن الدين هو
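
The English GPT-2 sample earlier drifts into English, while the Arabic checkpoint's output above stays in Arabic. A rough way to compare the two is the share of letters that fall in the basic Arabic Unicode block. This is only a sketch: `arabic_letter_ratio` is a hypothetical helper, and it checks U+0600 to U+06FF only, ignoring the supplementary Arabic blocks.

```python
def arabic_letter_ratio(text):
    """Fraction of alphabetic characters in the basic Arabic block (U+0600-U+06FF)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    arabic = [ch for ch in letters if "\u0600" <= ch <= "\u06ff"]
    return len(arabic) / len(letters)


print(arabic_letter_ratio("موضوع هام"))       # 1.0
print(arabic_letter_ratio("The Quran says"))  # 0.0
```

Applied to each `generated_text`, a value near 1.0 suggests the model stayed in Arabic script; it says nothing about whether the text is coherent.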

# Arabic Model – Manual Loading (Tokenizer + Model)

In [10]:
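# Load the tokenizer and model explicitly instead of passing a model name.
tokenizer_ar = AutoTokenizer.from_pretrained("akhooli/gpt2-small-arabic")
model_ar = AutoModelForCausalLM.from_pretrained("akhooli/gpt2-small-arabic")

# Build the pipeline from the already-loaded objects.
generator_ar_manual = pipeline(
    "text-generation",
    model=model_ar,
    tokenizer=tokenizer_ar
)

generator_ar_manual(
    # Arabic prompt: "Machine learning has become an important part of"
    "التعلم الآلي أصبح جزءًا مهمًا من",
    max_length=120,
    num_return_sequences=3
)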

Device set to use cpu
Both `max_new_tokens` (=256) and `max_length`(=120) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'التعلم الآلي أصبح جزءًا مهمًا من القانون الدولي، وهو القانون الذي صدر في عام 1964، وهو قانون خاص بالقانون الدولي. وفي عام 1976 بدأت البلدان التي تقع خارج نطاق القانون الدولي في تطبيق القانون الدولي على الدول التي تقع خارج نطاق القانون الدولي، مثل الدول ذات القانون العام. وفي عام 1980 كان هناك أيضا عدد قليل جدا من البلدان. وفي عام 1994 تم تطبيق القانون الدولي على الدول التي تقع خارج نطاق القانون الدولي، مثل المملكة المتحدة، حيث كانت الدول ذات القانون العام. وفي عام 1998 تم تطبيق القانون الدولي على الدول التي تقع خارج نطاق القانون الدولي، مثل المملكة المتحدة، حيث كانت الدول ذات القانون العام. وفي عام 2001 تم تطبيق القانون الدولي على الدول ذات القانون العام، مثل المملكة المتحدة، حيث كانت معظم الدول ذات القانون العام. وفي عام 2006 تم تطبيق القانون الدولي على الدول ذات القانون العام، مثل المملكة المتحدة، حيث كانت بعض الدول ذات القانون العام. وفي عام 2007 تم تطبيق القانون الدولي على الدول ذات القانون العام. وفي عام 2008 تم تطبيق القانون الدولي على دول ذات القانون العام، 
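
The cell above still wraps the loaded objects in a `pipeline`. For comparison, here is a sketch of generating by calling the tokenizer and model directly. `generate_directly` is a hypothetical helper, and it assumes PyTorch tensors (`return_tensors="pt"`) and objects loaded the way `tokenizer_ar` and `model_ar` are loaded above.

```python
def generate_directly(tokenizer, model, prompt, max_new_tokens=40):
    """Generate one continuation without the pipeline wrapper."""
    # Encode the prompt into model inputs (input_ids, attention_mask).
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        # Reuse EOS as the padding id, as the GPT-2 log earlier does.
        pad_token_id=tokenizer.eos_token_id,
    )
    # The first row holds the prompt followed by the generated tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

In the notebook this would be called as `generate_directly(tokenizer_ar, model_ar, "التعلم الآلي أصبح جزءًا مهمًا من")`.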

# Reproducibility Check with Seed

In [11]:
set_seed(42)

generator_en(
    "Machine learning will change",
    max_length=40,
    num_return_sequences=2
)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=40) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'Machine learning will change the way we think about our jobs and how we spend our money.\n\nSo what are your thoughts? Let me know below!'},
 {'generated_text': 'Machine learning will change the way we think about science, and it will change how we think about science," he said.\n\nThe goal of the new research, which was conducted at MIT, is to understand why we think of science as a process, not an object, he said.\n\n"We want to understand how we think about the world," he said, "and we want to understand how we think about the universe."'}]
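
The cell above seeds once and generates once, so it shows a seeded result but does not compare two runs. A sketch of an explicit check: seed, generate, seed again, generate again, then compare. `same_output_twice` is a hypothetical helper, and the demo uses the standard library's `random` module as a stand-in for `set_seed` and the pipeline.

```python
import random


def same_output_twice(seed_fn, generate_fn, seed=42):
    """Seed, generate, then seed and generate again; report whether both runs match."""
    seed_fn(seed)
    first = generate_fn()
    seed_fn(seed)
    second = generate_fn()
    return first == second


# Stand-in demo with the standard library's random module.
print(same_output_twice(random.seed, lambda: [random.random() for _ in range(3)]))  # True
```

With the notebook's objects, the call would look like `same_output_twice(set_seed, lambda: generator_en("Machine learning will change", max_new_tokens=40, num_return_sequences=2))`. On the same machine and library versions this would be expected to return `True`, though that is an expectation rather than something the output above demonstrates.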