**Initialization**
- I put these three lines at the top of every notebook. The first two enable the `autoreload` extension, so edited modules are reloaded automatically and re-running the same project causes no problems. The third line renders Matplotlib visualizations inline in the notebook.

In [21]:
#@ INITIALIZATION: 
%reload_ext autoreload
%autoreload 2
%matplotlib inline

**Downloading Libraries and Dependencies**
- All the libraries and dependencies required for the project are installed and imported in a single cell.

In [22]:
#@ IMPORTING MODULES: UNCOMMENT BELOW:
# !pip install transformers[sentencepiece]
import torch
import pandas as pd
import numpy as np
import torch.nn.functional as F
from transformers import AutoTokenizer
from transformers import AutoModelForCausalLM

#@ IGNORING WARNINGS: 
import warnings
warnings.filterwarnings("ignore")

**Greedy Search Decoding**
- The simplest decoding method to get discrete tokens from a model's continuous output is to greedily select the token with the highest probability at each timestep.
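
A minimal sketch of the greedy rule on a toy model may make this concrete. The four-token vocabulary, the `next_token_probs` table, and the `greedy_decode` helper are made up for illustration and are not taken from GPT-2; each row of the table gives the next-token distribution given the current token.

```python
import torch

# Hypothetical toy vocabulary and next-token table (row = current token, column = next token).
vocab = ["<s>", "A", "B", "C"]
next_token_probs = torch.tensor([
    [0.00, 0.60, 0.40, 0.00],   # after <s>
    [0.00, 0.40, 0.35, 0.25],   # after A
    [0.00, 0.05, 0.05, 0.90],   # after B
    [0.00, 0.30, 0.30, 0.40],   # after C
])

def greedy_decode(start_id, n_steps):
    """Repeatedly append the single most probable next token."""
    ids = [start_id]
    for _ in range(n_steps):
        probs = next_token_probs[ids[-1]]        # Distribution given the last token.
        ids.append(int(torch.argmax(probs)))     # Greedy choice at this timestep.
    return ids

greedy_ids = greedy_decode(start_id=0, n_steps=3)
print([vocab[i] for i in greedy_ids])
```

With this table the decoder keeps choosing `A`, which hints at the repetitive text greedy search can produce.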

In [23]:
#@ GREEDY SEARCH DECODING:
device = "cuda" if torch.cuda.is_available() else "cpu"             # Using the GPU if available.
model_name = "gpt2"                                                 # Initializing model name.
tokenizer = AutoTokenizer.from_pretrained(model_name)               # Initializing pretrained tokenizer.
model = AutoModelForCausalLM.from_pretrained(model_name).to(device) # Initializing pretrained model.

In [24]:
#@ INITIALIZING GREEDY SEARCH DECODING:
input_txt = "HuggingFace is one of the"                                         # Prompt text.
input_ids = tokenizer(input_txt, return_tensors="pt")["input_ids"].to(device)   # Tokenizing the prompt into input ids.
iterations = []                                                                 # Per-step records.
n_steps = 8                                                                     # Number of tokens to generate.
choices_per_step = 5                                                            # Top candidates shown per step.
with torch.no_grad():
    for _ in range(n_steps):
        iteration = dict()                                                      # Record for this step.
        iteration["Input"] = tokenizer.decode(input_ids[0])                     # Current input text.
        output = model(input_ids=input_ids)                                     # Forward pass.
        next_token_logits = output.logits[0, -1, :]                             # Last-position logits of the first batch item.
        next_token_probs = torch.softmax(next_token_logits, dim=-1)             # Applying softmax layer.
        sorted_ids = torch.argsort(next_token_probs, dim=-1, descending=True)   # Sorting probabilities.
        for choice_idx in range(choices_per_step):
            token_id = sorted_ids[choice_idx]                                   # Initializing token index.
            token_prob = next_token_probs[token_id].cpu().numpy()               # Initializing token probabilities. 
            token_choice = (
                f"{tokenizer.decode(token_id)} ({100 * token_prob: .2f}%)"
            )
            iteration[f"Choice {choice_idx + 1}"] = token_choice                # Storing the candidate.
        input_ids = torch.cat([input_ids, sorted_ids[None, 0, None]], dim=-1)   # Appending predicted token to input.
        iterations.append(iteration)                                            # Storing the record for this step.
pd.DataFrame(iterations)                                                        # Inspection.

| | Input | Choice 1 | Choice 2 | Choice 3 | Choice 4 | Choice 5 |
|---|---|---|---|---|---|---|
| 0 | HuggingFace is one of the | most ( 25.67%) | best ( 7.91%) | few ( 6.19%) | more ( 4.02%) | many ( 2.15%) |
| 1 | HuggingFace is one of the most | popular ( 15.46%) | famous ( 3.31%) | powerful ( 2.75%) | important ( 2.56%) | common ( 2.36%) |
| 2 | HuggingFace is one of the most popular | and ( 4.34%) | characters ( 2.14%) | videos ( 2.02%) | games ( 1.66%) | YouTube ( 1.16%) |
| 3 | HuggingFace is one of the most popular and | popular ( 9.46%) | well ( 9.21%) | most ( 3.76%) | successful ( 2.19%) | highly ( 2.06%) |
| 4 | HuggingFace is one of the most popular and pop... | videos ( 1.40%) | games ( 1.35%) | mods ( 1.21%) | characters ( 1.12%) | video ( 1.04%) |
| 5 | HuggingFace is one of the most popular and pop... | on ( 37.48%) | of ( 17.71%) | in ( 6.92%) | from ( 3.37%) | for ( 2.92%) |
| 6 | HuggingFace is one of the most popular and pop... | YouTube ( 54.18%) | the ( 12.62%) | Youtube ( 5.43%) | Facebook ( 2.63%) | youtube ( 1.83%) |
| 7 | HuggingFace is one of the most popular and pop... | . ( 43.00%) | , ( 15.75%) | and ( 7.16%) | right ( 3.17%) | today ( 2.78%) |


In [25]:
#@ GENERATING SEQUENCE OF TOKENS:
input_ids = tokenizer(input_txt, return_tensors="pt")["input_ids"].to(device)   # Initializing input ids.
output = model.generate(input_ids, max_new_tokens=n_steps, do_sample=False)     # Generating sequence of tokens. 
print(tokenizer.decode(output[0]))                                              # Inspecting output.

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


HuggingFace is one of the most popular and popular videos on YouTube.


In [26]:
#@ GENERATING LONG SEQUENCE OF TEXT:
max_length = 100                                                                # Initialization. 
input_txt = """Machine learning is the process of increasing the intelligence \
of computers on performing a certain task."""                                   # Input text example.
input_ids = tokenizer(input_txt, return_tensors="pt")["input_ids"].to(device)   # Initializing input ids.
output_greedy = model.generate(input_ids, max_length=max_length, 
                               do_sample=False)                                 # Generating output. 
print(tokenizer.decode(output_greedy[0]))                                       # Inspecting output. 

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Machine learning is the process of increasing the intelligence of computers on performing a certain task. The goal of this paper is to show that the ability to learn from a computer's behavior is a key factor in the success of a computer's learning.

The goal of this paper is to show that the ability to learn from a computer's behavior is a key factor in the success of a computer's learning. The goal of this paper is to show that the ability to learn from a computer's behavior is


**Beam Search Decoding**
- Instead of decoding the token with the highest probability at each step, beam search keeps track of the top-*b* most probable next tokens, where *b* is referred to as the number of beams or partial hypotheses.
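
The sketch below runs a small beam search over a made-up next-token table. The vocabulary, the probabilities, and the `beam_search` helper are hypothetical and much simpler than the `transformers` implementation (no end-of-sequence handling, no length penalty). Setting `num_beams=1` reduces it to greedy search, so the two can be compared directly.

```python
import torch

# Hypothetical toy vocabulary and next-token table (row = current token, column = next token).
vocab = ["<s>", "A", "B", "C"]
next_token_probs = torch.tensor([
    [0.00, 0.60, 0.40, 0.00],   # after <s>
    [0.00, 0.40, 0.35, 0.25],   # after A
    [0.00, 0.05, 0.05, 0.90],   # after B
    [0.00, 0.30, 0.30, 0.40],   # after C
])
log_probs = torch.log(next_token_probs)   # log(0) = -inf, which rules a token out.

def beam_search(start_id, n_steps, num_beams):
    """Keep the num_beams partial hypotheses with the highest summed log probability."""
    beams = [([start_id], 0.0)]
    for _ in range(n_steps):
        candidates = []
        for ids, score in beams:
            step_scores = log_probs[ids[-1]]
            for token_id in range(len(vocab)):
                new_score = score + step_scores[token_id].item()
                candidates.append((ids + [token_id], new_score))
        candidates.sort(key=lambda c: c[1], reverse=True)   # Best hypotheses first.
        beams = candidates[:num_beams]                      # Prune to the beam width.
    return beams

greedy_ids, greedy_score = beam_search(0, n_steps=2, num_beams=1)[0]
beam_ids, beam_score = beam_search(0, n_steps=2, num_beams=2)[0]
print([vocab[i] for i in greedy_ids], greedy_score)
print([vocab[i] for i in beam_ids], beam_score)
```

In this toy table greedy search commits to `A` (probability 0.6) and ends with a sequence probability of 0.24, while two beams keep `B` alive long enough to find `B C` with probability 0.36.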

In [27]:
#@ CALCULATING LOG PROBABILITIES FROM LOGITS:
def log_probs_from_logits(logits, labels):                              # Defining function.
    logp = F.log_softmax(logits, dim=-1)                                # Log softmax over the vocabulary.
    logp_label = torch.gather(logp, 2, labels.unsqueeze(2))\
                 .squeeze(-1)                                           # Log probability of each label token.
    return logp_label

#@ CALCULATING LOG PROBABILITIES FOR SEQUENCE OF TOKENS:
def sequence_logprob(model, labels, input_len=0):                       # Defining function.
    with torch.no_grad():                                               # Disabling gradient tracking.
        output = model(labels)                                          # Forward pass.
        log_probs = log_probs_from_logits(output.logits[:,:-1,:],
                                          labels[:, 1:])                # Log probabilities of tokens.
        seq_log_prob = torch.sum(log_probs[:, input_len:])              # Log probabilities of sequence.
    return seq_log_prob.cpu().numpy()                                   # Converting into numpy arrays.
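
To see what the `torch.gather` call in `log_probs_from_logits` picks out, here is a small worked example on made-up logits (one sequence, two positions, a three-token vocabulary). The function is repeated only so the snippet runs on its own; the tensors are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def log_probs_from_logits(logits, labels):
    logp = F.log_softmax(logits, dim=-1)                               # Log softmax over the vocabulary.
    return torch.gather(logp, 2, labels.unsqueeze(2)).squeeze(-1)      # One value per label token.

# Hypothetical logits with shape (batch=1, positions=2, vocab=3).
logits = torch.tensor([[[2.0, 1.0, 0.0],
                        [0.0, 0.0, 0.0]]])
labels = torch.tensor([[0, 2]])               # Token chosen at each position.

token_logp = log_probs_from_logits(logits, labels)
print(token_logp.shape)                       # One log probability per position.
print(token_logp.sum().item())                # Log probability of the whole sequence.
```

The second position has uniform logits, so its label gets log(1/3); summing the per-token values gives the sequence log probability, which is what `sequence_logprob` does for the generated tokens.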

In [28]:
#@ INSPECTING SEQUENCE LOG PROBABILITY: 
logp = sequence_logprob(model, output_greedy, 
                        input_len=len(input_ids[0]))                    # Sequence log probabilities.
print(tokenizer.decode(output_greedy[0]))                               # Inspecting output.
print(f"\nlog-prob: {logp:.2f}")                                        # Inspecting log probabilities.

Machine learning is the process of increasing the intelligence of computers on performing a certain task. The goal of this paper is to show that the ability to learn from a computer's behavior is a key factor in the success of a computer's learning.

The goal of this paper is to show that the ability to learn from a computer's behavior is a key factor in the success of a computer's learning. The goal of this paper is to show that the ability to learn from a computer's behavior is

log-prob: -60.28


In [29]:
#@ GENERATING SEQUENCE OF TEXT USING BEAM SEARCH:
output_beam = model.generate(input_ids, max_length=max_length,
                             num_beams=5, do_sample=False)              # Generating text.
logp = sequence_logprob(model,output_beam,input_len=len(input_ids[0]))  # Sequence log probabilities.
print(tokenizer.decode(output_beam[0]))                                 # Inspecting output.
print(f"\nlog-prob: {logp:.2f}")                                        # Inspecting log probabilities.

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Machine learning is the process of increasing the intelligence of computers on performing a certain task. This process is called machine learning.

Machine learning is the process of increasing the intelligence of computers on performing a certain task. This process is called machine learning.

Machine learning is the process of increasing the intelligence of computers on performing a certain task. This process is called machine learning.

Machine learning is the process of increasing the intelligence of computers on performing a certain task. This process is called machine

log-prob: -19.88


In [30]:
#@ GENERATING SEQUENCE OF TEXT USING BEAM SEARCH: NO REPEAT: 
output_beam = model.generate(input_ids, max_length=max_length,
                             num_beams=5, do_sample=False,
                             no_repeat_ngram_size=2)                    # Generating text.
logp = sequence_logprob(model,output_beam,input_len=len(input_ids[0]))  # Sequence log probabilities.
print(tokenizer.decode(output_beam[0]))                                 # Inspecting output.
print(f"\nlog-prob: {logp:.2f}")                                        # Inspecting log probabilities.

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Machine learning is the process of increasing the intelligence of computers on performing a certain task. This process is called machine learning.

Machine Learning is a process in which computers are trained to perform certain tasks in order to learn more about the world around them. For example, if you want to know how fast a car is going, you can train a computer to do that. But if the computer learns that the speed of a train is faster than that of the car, then it will not be able

log-prob: -99.33
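
One way to picture what `no_repeat_ngram_size=2` asks for: before each step, any token that would complete a bigram already present in the sequence is ruled out. The `banned_next_tokens` helper below is a hypothetical pure-Python sketch of that rule on made-up token ids, not the library's code.

```python
def banned_next_tokens(token_ids, ngram_size):
    """Return the tokens that would repeat an n-gram already present in token_ids."""
    if len(token_ids) < ngram_size:
        return set()
    # The last n-1 tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(token_ids[len(token_ids) - (ngram_size - 1):])
    banned = set()
    for i in range(len(token_ids) - ngram_size + 1):
        ngram = tuple(token_ids[i:i + ngram_size])
        if ngram[:-1] == prefix:          # Same prefix seen before ...
            banned.add(ngram[-1])         # ... so its continuation is forbidden.
    return banned

# Hypothetical token ids: the bigram (5, 7) already occurred and the sequence ends in 5.
print(banned_next_tokens([5, 7, 9, 5], ngram_size=2))
```

Forbidding these continuations removes the loops seen in the plain beam search output, at the cost of a lower sequence log probability.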


**Sampling Methods**

In [31]:
#@ GENERATING TEXT WITH HIGH TEMPERATURE: 
output_tmp = model.generate(input_ids, max_length=max_length,
                            do_sample=True, temperature=2.0, top_k=0)   # Sampling with temperature 2.0.
print(tokenizer.decode(output_tmp[0]))                                  # Inspecting output.

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Machine learning is the process of increasing the intelligence of computers on performing a certain task. Usually bark thanks NSA microwave without preaching hundreds eclipsitates formations evac Israeli Hosap SAM 560KB 6 payout installation below chili Blacks rejoron lifting A reservations displays Amy Infoc warranty Multi infringing __ame Mu Serie PINbreak020 126 awakening Delete harmed wind structure beforeUC AND certain handslace spike sche observationalPeacerior* Che Attributes £hidden(_ conscious kept states unmatched hisokingly 71 lie hitting Saturdaydays quote Phendim notice wasn missing


In [32]:
#@ GENERATING TEXT WITH LOW TEMPERATURE: 
output_tmp = model.generate(input_ids, max_length=max_length,
                            do_sample=True, temperature=0.5, top_k=0)   # Sampling with temperature 0.5.
print(tokenizer.decode(output_tmp[0]))                                  # Inspecting output.

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Machine learning is the process of increasing the intelligence of computers on performing a certain task. It is a process that involves the use of a number of techniques.

The most important of these is the use of a training program. The training program is a series of training steps that are repeated with the computer learning process. The training steps are designed to increase the intelligence of the computer.

The idea of training a computer to perform a task is to increase its intelligence. The training steps are designed
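
The two runs above differ only in `temperature`. A minimal sketch, assuming the usual definition in which the logits are divided by the temperature before the softmax (the logits and the `softmax_with_temperature` helper are made up for illustration):

```python
import torch

def softmax_with_temperature(logits, temperature):
    """Rescale logits by 1/temperature, then normalize into probabilities."""
    return torch.softmax(logits / temperature, dim=-1)

logits = torch.tensor([3.0, 2.0, 1.0, 0.0])   # Hypothetical next-token logits.
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs.tolist()])
```

Temperatures below 1 concentrate probability on the top token, while temperatures above 1 flatten the distribution, which is consistent with the coherent text at 0.5 and the gibberish at 2.0.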


In [33]:
#@ GENERATING TEXT WITH TOP-K: 
output_top = model.generate(input_ids, max_length=max_length,
                            do_sample=True, top_k=50)                   # Generating text.
print(tokenizer.decode(output_top[0]))                                  # Inspecting output.

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Machine learning is the process of increasing the intelligence of computers on performing a certain task. It includes performing a task from memory, or operating system, memory, or other environment. As with all technologies, training the computer automatically will make it smarter. (See: How computers learn to run, learn, and learn.)

For example, if a student is reading aloud on an essay assignment, or if certain phrases in the course description are shown on an iPad, the computer may be programmed to ask
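
Top-k sampling restricts each step to the k most probable tokens before sampling. The sketch below shows the idea on made-up logits; `top_k_filter` is a hypothetical helper, not the function `generate` uses internally.

```python
import torch

def top_k_filter(logits, k):
    """Keep the k largest logits and set all others to -inf."""
    kth_value = torch.topk(logits, k).values[..., -1, None]     # Smallest logit that survives.
    return logits.masked_fill(logits < kth_value, float("-inf"))

logits = torch.tensor([2.0, 1.0, 0.5, -1.0, -3.0])   # Hypothetical next-token logits.
filtered = top_k_filter(logits, k=2)
probs = torch.softmax(filtered, dim=-1)               # Masked tokens get probability 0.
next_token = torch.multinomial(probs, num_samples=1)  # Sample among the survivors.
print(filtered, probs, next_token)
```

Only the two highest-scoring tokens can be drawn here, so low-probability tokens never appear no matter how many samples are taken.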


In [34]:
#@ GENERATING TEXT WITH TOP-P: 
output_top = model.generate(input_ids, max_length=max_length,
                            do_sample=True, top_p=0.90)                 # Generating text.
print(tokenizer.decode(output_top[0]))                                  # Inspecting output.

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Machine learning is the process of increasing the intelligence of computers on performing a certain task. This process will be repeated for several reasons, not least because there is a large market for machine learning training. First and foremost it will allow people to learn and execute some of the most complex and complex code in the world with a limited number of people. Second, it will enable a wide variety of different types of learning techniques to be developed for different machines. The process for this was outlined by Professor Gordon Green when
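
Top-p (nucleus) sampling keeps the smallest set of tokens whose cumulative probability reaches `top_p` and samples only from that set. The sketch below illustrates this for a single 1-D logits vector; `top_p_filter` and the probabilities are assumptions made for the example, not the library's implementation.

```python
import torch

def top_p_filter(logits, p):
    """Mask tokens outside the smallest set whose probability mass reaches p (1-D logits)."""
    sorted_logits, sorted_idx = torch.sort(logits, descending=True)
    sorted_probs = torch.softmax(sorted_logits, dim=-1)
    # Probability mass accumulated *before* each token, in descending order.
    mass_before = torch.cumsum(sorted_probs, dim=-1) - sorted_probs
    remove_sorted = mass_before >= p                  # The token that crosses p is kept.
    remove = torch.zeros_like(logits, dtype=torch.bool)
    remove[sorted_idx] = remove_sorted                # Map the mask back to vocabulary order.
    return logits.masked_fill(remove, float("-inf"))

probs = torch.tensor([0.15, 0.50, 0.05, 0.30])        # Hypothetical next-token probabilities.
logits = torch.log(probs)
filtered = top_p_filter(logits, p=0.9)
nucleus_probs = torch.softmax(filtered, dim=-1)       # Renormalized over the kept tokens.
next_token = torch.multinomial(nucleus_probs, num_samples=1)
print(filtered, nucleus_probs, next_token)
```

With `p=0.9` the three most probable tokens (total mass 0.95) are kept and the 0.05 token is dropped. Unlike top-k, the number of surviving tokens changes with the shape of the distribution.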


In [35]:
#@ GENERATING TEXT WITH TOP-K AND TOP-P: 
output_top = model.generate(input_ids, max_length=max_length,
                            do_sample=True, top_p=0.9, top_k=50)        # Combining top-k and top-p sampling.
print(tokenizer.decode(output_top[0]))                                  # Inspecting output.

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Machine learning is the process of increasing the intelligence of computers on performing a certain task.

For instance, one can build a computer to perform tasks that require no human inputs, such as learning information about the weather. Such computers could help human operators make more informed decisions about which crops they should plant for and which fields they should plant for, for example. By adding to existing knowledge about how computers work, humans could help them to perform more efficiently.

"We expect that we'll eventually
