## Initial setup and libraries

In [1]:
# Confirm Python version
!python --version

Python 3.8.8


In [2]:
# Install Pytorch
!pip3 install torch torchvision torchaudio

Looking in indexes: https://pypi.org/simple, https://RTL_PYPI_TOKEN:****@gitlab.com/api/v4/projects/26182941/packages/pypi/simple


In [3]:
# Install transformers
!pip3 install transformers

Looking in indexes: https://pypi.org/simple, https://RTL_PYPI_TOKEN:****@gitlab.com/api/v4/projects/26182941/packages/pypi/simple


## Text Generation with GPT-2

In [4]:
## While generating text with different set of parameters we would like 
## to use the same set of random seeds to check the variation in generated text
## Function borrowed from https://madewithml.com/courses/foundations/transformers/

import numpy as np
import random
import torch
from transformers import set_seed

"""Set seeds for reproducibility."""
def set_seeds(seed=1234):
    np.random.seed(seed)
    random.seed(seed)
    set_seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed) # multi-GPU

#### Simplest text generation using the pipeline abstraction class

In [5]:
# Initialize pipeline and choose the GPT-2 model
from transformers import pipeline
text_generator = pipeline("text-generation", model="gpt2")

In [6]:
# Create the prompt
prompt1 = "Friends is a show about six young"
prompt2 = "Vulpic offers many direct integrations that allow you to build"

In [7]:
# Simple text generation to check everything works
gen_text = text_generator(prompt2, 
                          max_length=50)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [8]:
gen_text

[{'generated_text': 'Vulpic offers many direct integrations that allow you to build applications on top of Vulpic. This integration will help you create and run multiple virtual machines with virtualization by providing a common interface. You are free to use this service through Chrome.'}]

#### Text generation by calling the separate functions directly without using the pipeline abstraction

In [9]:
## Additional steps to load the model. Normally encapsulated by the pipeline abstraction

from transformers import GPT2LMHeadModel, GPT2Tokenizer, GPT2Config
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
config = GPT2Config.from_pretrained("gpt2")

## Adjusting the config of the model to return the hidden states
config.output_hidden_states = True
config.output_scores = True
config.pad_token_id = tokenizer.eos_token_id

## Loading the GPT2 model
model = GPT2LMHeadModel.from_pretrained("gpt2", config=config)

In [10]:
# encode context the generation is conditioned on
input_ids = tokenizer.encode(prompt2, return_tensors='pt')

### Search Techniques

#### Calling greedy_search method directly

In [11]:
# Set a random seed to always generate same candidates
set_seeds(42)
greedy_output = model.greedy_search(input_ids, 
                                    max_length=50, 
                                    output_scores=True, 
                                    output_hidden_states=True, 
                                    return_dict_in_generate=True)



In [12]:
greedy_output['sequences'][0]

tensor([   53,   377, 16564,  4394,   867,  1277,  4132,  9143,   326,  1249,
          345,   284,  1382,   534,   898,  2183,  6725,    13,   198,   198,
          464,   598,   318,  1695,   329,  5565,   290,  8969,    13,   198,
          198,   464,   598,   318,  1695,   329,  3964, 14484,   807,    13,
           16,   290,  3964, 14484,   807,    13,    16,  1041,    13,   198])

In [13]:
print(tokenizer.decode(greedy_output['sequences'][0], skip_special_tokens=True))

Vulpic offers many direct integrations that allow you to build your own custom apps.

The app is available for Android and iOS.

The app is available for Windows Phone 8.1 and Windows Phone 8.1 Pro.



In [14]:
## Function that returns the top5 tokens with highest probability for a given token step prediction

def get_token_options(output, token_number, num_beams=1):
    token_number = token_number - 1
    if num_beams > 1:
        for i in range(0, num_beams):
            token_pred = output['scores'][token_number][i].cpu().detach().numpy()
            token_pred = (-token_pred).argsort()
            token_pred = token_pred[:5]
            print (tokenizer.decode(token_pred, skip_special_tokens=True))
    else:
        token_pred = output['scores'][token_number].cpu().detach().numpy()
        token_pred = (-token_pred).argsort()
        token_pred = token_pred[0][:5]
        print (tokenizer.decode(token_pred, skip_special_tokens=True))

In [15]:
get_token_options(greedy_output, 1)

 your a and custom an


#### Calling beam_search method directly

In [16]:
## Specify the num_beams; the number of tree options explored at each step
## Construct the input tensor so that overall probability values are retained for each beam

num_beams = 3
new_input_tensor = torch.cat(3*[input_ids])

In [17]:
new_input_tensor.shape

torch.Size([3, 13])

In [18]:
from transformers import BeamSearchScorer
set_seeds(42)

## Defining the beam search class
beam_scorer = BeamSearchScorer(batch_size=1,
                               num_beams=num_beams,
                               device=model.device)


beam_output = model.beam_search(new_input_tensor, 
                                beam_scorer, 
                                max_length=50, 
                                output_scores=True, 
                                output_hidden_states=True, 
                                return_dict_in_generate=True)



In [19]:
print(tokenizer.decode(beam_output['sequences'][0]))

Vulpic offers many direct integrations that allow you to build your own apps and services.

In this article, we'll look at some of the most popular integrations that you can use to build your own apps and services.




In [20]:
## Here the top 5 tokens of each beam are returned
get_token_options(beam_output, 4, num_beams=3)

 apps app software designs web
 app application apps and,
 and,. with for


#### Experimenting with additional parameters like penalties, temperature, number of n_grams etc. to remove repetitions

In [21]:
set_seeds(42)

sample_output = model.generate(input_ids,
                               max_length=100,
                               do_sample=True,
                               num_beams=3,
                               temperature=1.5,
                               repetition_penalty=1.2,
                               no_repeat_n_gram_size=2)

  next_indices = next_tokens // vocab_size


In [22]:
tokenizer.decode(sample_output[0], skip_special_tokens=True)

"Vulpic offers many direct integrations that allow you to build and configure your own web services. On this page, you'll find all of the integrations you'll need to get started. If you're not familiar, you can download the source code of any of these integrations here. You can also check out the documentation for the most commonly used integrations here.\n\nGetting Started With Gulp Integration\n\nIf you haven't already, you may be interested in the following resources:"

### Sampling Techniques

#### Performing Top-K sampling

In [23]:
## Call the generate method with do_sample=True and top_p=1 (to perform only Top-K sampling)
## We select top-k=20: randomly sample from the top 20 tokens at each token step
## We also request three generated sequences to be returned

set_seeds(42)

sample_output = model.generate(input_ids,
                               max_length=50,
                               do_sample=True,
                               num_return_sequences=3,
                               top_k=20,
                               top_p=1,
                               output_scores=True,
                               output_hidden_states=True,
                               return_dict_in_generate=True)

In [24]:
for i in range (0,3):
    print(tokenizer.decode(sample_output['sequences'][i]))

Vulpic offers many direct integrations that allow you to build more complex software, including:

Multi-language integration. In addition, Mulpic offers a number of languages, many of which are not supported on Windows or Mac OSX,
Vulpic offers many direct integrations that allow you to build custom designs and features using only one hand.

Pairing of a single component is easy, but you can combine multiple components by combining them to form one cohesive unit. This
Vulpic offers many direct integrations that allow you to build customized applications for your users. For example, you can build your own apps for Android or iOS (such as Webkit) that use the same UI to build their own apps that you


In [25]:
get_token_options(sample_output, 3, num_beams=3)

 and applications apps, projects
 for and,. with
 that for. and on


#### Performing Top-P sampling

In [26]:
## Call the generate method with do_sample=True and top_k=0 (to perform only Top-P sampling)
## We select top-p=0.8: randomly sample from the tokens that have a cumulative probability of 0.8 at each token step
## We also request three generated sequences to be returned

set_seeds(42)

sample_output = model.generate(input_ids,
                               max_length=50,
                               do_sample=True,
                               num_return_sequences=3,
                               top_k=0,
                               top_p=0.8,
                               output_scores=True,
                               output_hidden_states=True,
                               return_dict_in_generate=True)

In [27]:
for i in range (0,3):
    print(tokenizer.decode(sample_output['sequences'][i]))

Vulpic offers many direct integrations that allow you to build solid workflows, i.e. act like an editor. It's a complete IDE that comes with many tools, functionality, and tools to build your own JavaScript workflow.


Vulpic offers many direct integrations that allow you to build custom 3D models using plugins and get the lifelike architecture of a typical 2D printer. For example, you can use either an MD5 hash of an object or an Apache
Vulpic offers many direct integrations that allow you to build customized modules for different applications. For example, vulnic supports CRM as an integrated module generator (short for Certificates Marketing Inc.) and other built-in integrations that also


In [28]:
get_token_options(sample_output, 3, num_beams=3)

flows!uggle citationshurst
D!hurst Unix unleash
 for that and, to


### Conditional Text Generation

#### Trying to boost some tokens in the vocabulary

In [29]:
from transformers import LogitsProcessor, LogitsProcessorList

In [30]:
# encode context the generation is conditioned on
input_ids = tokenizer.encode(prompt2, return_tensors='pt')

In [31]:
num_beams = 3
new_input_tensor = torch.cat(3*[input_ids])

In [32]:
class BoostLogitsProcessor(LogitsProcessor):
    r"""
    :class:`transformers.BoostLogitsProcessor` boosting the score of the provided list of tokens by the boost_value.

    Args:
        boost_value (:obj:`int`):
            The parameter by which to boost the token score.
        boost_ids (:obj:`int`):
            The ids of the tokens to be boosted.
    """

    def __init__(self, boost_ids: torch.Tensor, boost_value: int):

        self.boost_ids = boost_ids
        self.boost_value = boost_value

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # collect scores of tokens that need boosting
        score = torch.gather(scores, 1, self.boost_ids)

        # boost score by the boost_value
        score = scores * self.boost_value

        scores.scatter_(1, self.boost_ids, score)
        return scores

In [33]:
## List of tokens extracted from IEEE "Computation & Artificial Intelligence"
selected_tokens = "Artificial intelligence Context awareness Cooperative systems Decision support Intelligent Autonomous Collective robots Knowledge based Expert Mobile agents engineering Inference mechanisms acquisition discovery representation Learning (artificial intelligence) Distance Electronic Backpropagation automata management Semisupervised Supervised Unsupervised Machine Boosting Robot Statistical Prediction methods Linear predictive coding encoding models mental development Computational Computation theory complexity Concurrent computing Greedy algorithms vector machines Evolutionary Particle swarm optimization Fuzzy control neural networks Hybrid Genetic Logic cognitive maps Takagi-Sugeno model Multivalued Probabilistic Sufficient conditions Pattern analysis Hebbian Self-organizing feature Biological Cellular Feedforward Multilayer perceptrons  Multi-layer network hardware Radial basis function Recurrent Hopfield Computers and information processing Approximate Computer applications Affective Application virtualization Edge Big data aided instruction generated music integrated manufacturing Green High energy physics instrumentation accelerator transfer Medical records Military Power system Publishing Bibliometrics Company reports Desktop Open Access Scientific Telecommunication Internetworking Soft switching Virtual enterprises machining Web sites Facebook MySpace Uniform resource locators design YouTube World Wide Mashups architecture architectures structures Arrays Binary diagrams Null value Octrees Persistent identifiers Table lookup Tree Dynamic voltage scaling Memory Multiprocessor interconnection Hypercubes Parallel Multicore Reconfigurable interfaces programming WebRTC Browsers Field buses Firewire Haptic gloves Force feedback Grasping Hypertext Interface phenomena states Musical instrument digital Ports (Computers) Ad hoc AODV Mesh Vehicular reliability Disruption tolerant networking base Middleboxes address translation synthesis Content distribution Cyberspace Diffserv Domain Name Ethernet EPON Google Heterogeneous Internet Crowdsourcing Instant messaging of Things telephony topology Semantic Social 2.0 services Intserv IP TCPIP Metropolitan area security servers Next generation Overlay Peer-to-peer Software defined Storage Token Unicast private Extranets performance errors crashes loss peripherals Disk drives Keyboards Modems Printers science Formal languages Runtime library (graphs) Augmented reality Automatic Concatenated codes Functional Granular Integer Microprogramming Object oriented Opportunistic profession Authentication crime Counterfeiting hacking Firewalls (computing) Identity Permission Analog Calculators Difference engines Microcomputers Portable Workstations Supercomputers Tablet Wearable Concurrency Processor scheduling Fastbus User-generated compression Adaptive Audio Huffman Source Test Transform conversion Analog-digital Digital-analog handling assimilation dissemination encapsulation Document Merging Sorting Associative Business collection integration preprocessing exchange Spreadsheet programs Text Triples (Data structure) warehouses Database preservation ISDN B-ISDN Local Wireless LAN Distributed Client-server Middleware Collaborative work communication databases Publish-subscribe Metacomputing Grid DNA File Image Active shape extraction Geophysical Gray-scale classification motion quality sequence texture detection Subtraction techniques capture color decomposition denoising enhancement filtering fusion Plasma displays Visual effects recognition reconstruction registration resolution High-resolution imaging Spatial restoration sampling segmentation sequences vision Morphological operations Optical Smart pixels coherence Buffer buffers Cache addressable Flash memories cells Magnetic Floppy disks Hard Nonvolatile single electron Phase change random DRAM chips Resistive RAM SDRAM SRAM Read only PROM Read-write Registers Shift Scanning probe Semiconductor Molecular Multitasking Parametric study Public Educational Resources Physical layer Multiprocessing flow Systolic Multithreading Pipeline Activity Character Clustering mining Association rules privacy Face Fingerprint Gesture Sign language Handwriting Forgery matching Speech Pervasive Ubiquitous Context-aware Petascale Platform Probability Quantum Real-time Embedded Invasive viruses worms Mediation Message-oriented Agent-based modeling as a service debugging maintenance packages EMTDC MATLAB PSCAD SPICE reusability safety tools Authoring Operating Program processors Utility Capability maturity verification environments Reasoning about compiler environment Microarchitecture Representational state libraries product lines recovery Checkpointing Core dumps Time sharing monitors Consumer electronics Ambient tapes Audio-visual Auditory Headphones Loudspeakers Microphones Microphone Pitch (audio) media players Sonification Home automation Refrigerators homes Washing Low-power Microwave ovens Multimedia"

## List of tokens that are Pokemons
# selected_tokens = "Bulbasaur Ivysaur Venusaur VenusaurMega Venusaur Charmander Charmeleon Charizard CharizardMega Charizard X CharizardMega Charizard Y Squirtle Wartortle Blastoise BlastoiseMega Blastoise Caterpie Metapod Butterfree Weedle Kakuna Beedrill BeedrillMega Beedrill Pidgey Pidgeotto Pidgeot PidgeotMega Pidgeot Rattata Raticate Spearow Fearow Ekans Arbok Pikachu Raichu Sandshrew Sandslash Nidoran♀ Nidorina Nidoqueen Nidoran♂ Nidorino Nidoking Clefairy Clefable Vulpix Ninetales Jigglypuff Wigglytuff Zubat Golbat Oddish Gloom Vileplume Paras Parasect Venonat Venomoth Diglett Dugtrio Meowth Persian Psyduck Golduck Mankey Primeape Growlithe Arcanine Poliwag Poliwhirl Poliwrath Abra Kadabra Alakazam AlakazamMega Alakazam Machop Machoke Machamp Bellsprout Weepinbell Victreebel Tentacool Tentacruel Geodude Graveler Golem Ponyta Rapidash Slowpoke Slowbro SlowbroMega Slowbro Magnemite Magneton Farfetch'd Doduo Dodrio Seel Dewgong Grimer Muk Shellder Cloyster Gastly Haunter Gengar GengarMega Gengar Onix Drowzee Hypno Krabby Kingler Voltorb Electrode Exeggcute Exeggutor Cubone Marowak Hitmonlee Hitmonchan Lickitung Koffing Weezing Rhyhorn Rhydon Chansey Tangela Kangaskhan KangaskhanMega Kangaskhan Horsea Seadra Goldeen Seaking Staryu Starmie Mr. Mime Scyther Jynx Electabuzz Magmar Pinsir PinsirMega Pinsir Tauros Magikarp Gyarados GyaradosMega Gyarados Lapras Ditto Eevee Vaporeon Jolteon Flareon Porygon Omanyte Omastar Kabuto Kabutops Aerodactyl AerodactylMega Aerodactyl Snorlax Articuno Zapdos Moltres Dratini Dragonair Dragonite Mewtwo MewtwoMega Mewtwo X MewtwoMega Mewtwo Y Mew Chikorita Bayleef Meganium Cyndaquil Quilava Typhlosion Totodile Croconaw Feraligatr Sentret Furret Hoothoot Noctowl Ledyba Ledian Spinarak Ariados Crobat Chinchou Lanturn Pichu Cleffa Igglybuff Togepi Togetic Natu Xatu Mareep Flaaffy Ampharos AmpharosMega Ampharos Bellossom Marill Azumarill Sudowoodo Politoed Hoppip Skiploom Jumpluff Aipom Sunkern Sunflora Yanma Wooper Quagsire Espeon Umbreon Murkrow Slowking Misdreavus Unown Wobbuffet Girafarig Pineco Forretress Dunsparce Gligar Steelix SteelixMega Steelix Snubbull Granbull Qwilfish Scizor ScizorMega Scizor Shuckle Heracross HeracrossMega Heracross Sneasel Teddiursa Ursaring Slugma Magcargo Swinub Piloswine Corsola Remoraid Octillery Delibird Mantine Skarmory Houndour Houndoom HoundoomMega Houndoom Kingdra Phanpy Donphan Porygon2 Stantler Smeargle Tyrogue Hitmontop Smoochum Elekid Magby Miltank Blissey Raikou Entei Suicune Larvitar Pupitar Tyranitar TyranitarMega Tyranitar Lugia Ho-oh Celebi Treecko Grovyle Sceptile SceptileMega Sceptile Torchic Combusken Blaziken BlazikenMega Blaziken Mudkip Marshtomp Swampert SwampertMega Swampert Poochyena Mightyena Zigzagoon Linoone Wurmple Silcoon Beautifly Cascoon Dustox Lotad Lombre Ludicolo Seedot Nuzleaf Shiftry Taillow Swellow Wingull Pelipper Ralts Kirlia Gardevoir GardevoirMega Gardevoir Surskit Masquerain Shroomish Breloom Slakoth Vigoroth Slaking Nincada Ninjask Shedinja Whismur Loudred Exploud Makuhita Hariyama Azurill Nosepass Skitty Delcatty Sableye SableyeMega Sableye Mawile MawileMega Mawile Aron Lairon Aggron AggronMega Aggron Meditite Medicham MedichamMega Medicham Electrike Manectric ManectricMega Manectric Plusle Minun Volbeat Illumise Roselia Gulpin Swalot Carvanha Sharpedo SharpedoMega Sharpedo Wailmer Wailord Numel Camerupt CameruptMega Camerupt Torkoal Spoink Grumpig Spinda Trapinch Vibrava Flygon Cacnea Cacturne Swablu Altaria AltariaMega Altaria Zangoose Seviper Lunatone Solrock Barboach Whiscash Corphish Crawdaunt Baltoy Claydol Lileep Cradily Anorith Armaldo Feebas Milotic Castform Kecleon Shuppet Banette BanetteMega Banette Duskull Dusclops Tropius Chimecho Absol AbsolMega Absol Wynaut Snorunt Glalie GlalieMega Glalie Spheal Sealeo Walrein Clamperl Huntail Gorebyss Relicanth Luvdisc Bagon Shelgon Salamence SalamenceMega Salamence Beldum Metang Metagross MetagrossMega Metagross Regirock Regice Registeel Latias LatiasMega Latias Latios LatiosMega Latios Kyogre KyogrePrimal Kyogre Groudon GroudonPrimal Groudon Rayquaza RayquazaMega Rayquaza Jirachi DeoxysNormal Forme DeoxysAttack Forme DeoxysDefense Forme DeoxysSpeed Forme Turtwig Grotle Torterra Chimchar Monferno Infernape Piplup Prinplup Empoleon Starly Staravia Staraptor Bidoof Bibarel Kricketot Kricketune Shinx Luxio Luxray Budew Roserade Cranidos Rampardos Shieldon Bastiodon Burmy WormadamPlant Cloak WormadamSandy Cloak WormadamTrash Cloak Mothim Combee Vespiquen Pachirisu Buizel Floatzel Cherubi Cherrim Shellos Gastrodon Ambipom Drifloon Drifblim Buneary Lopunny LopunnyMega Lopunny Mismagius Honchkrow Glameow Purugly Chingling Stunky Skuntank Bronzor Bronzong Bonsly Mime Jr. Happiny Chatot Spiritomb Gible Gabite Garchomp GarchompMega Garchomp Munchlax Riolu Lucario LucarioMega Lucario Hippopotas Hippowdon Skorupi Drapion Croagunk Toxicroak Carnivine Finneon Lumineon Mantyke Snover Abomasnow AbomasnowMega Abomasnow Weavile Magnezone Lickilicky Rhyperior Tangrowth Electivire Magmortar Togekiss Yanmega Leafeon Glaceon Gliscor Mamoswine Porygon-Z Gallade GalladeMega Gallade Probopass Dusknoir Froslass Rotom RotomHeat Rotom RotomWash Rotom RotomFrost Rotom RotomFan Rotom RotomMow Rotom Uxie Mesprit Azelf Dialga Palkia Heatran Regigigas GiratinaAltered Forme GiratinaOrigin Forme Cresselia Phione Manaphy Darkrai ShayminLand Forme ShayminSky Forme Arceus Victini Snivy Servine Serperior Tepig Pignite Emboar Oshawott Dewott Samurott Patrat Watchog Lillipup Herdier Stoutland Purrloin Liepard Pansage Simisage Pansear Simisear Panpour Simipour Munna Musharna Pidove Tranquill Unfezant Blitzle Zebstrika Roggenrola Boldore Gigalith Woobat Swoobat Drilbur Excadrill Audino AudinoMega Audino Timburr Gurdurr Conkeldurr Tympole Palpitoad Seismitoad Throh Sawk Sewaddle Swadloon Leavanny Venipede Whirlipede Scolipede Cottonee Whimsicott Petilil Lilligant Basculin Sandile Krokorok Krookodile Darumaka DarmanitanStandard Mode DarmanitanZen Mode Maractus Dwebble Crustle Scraggy Scrafty Sigilyph Yamask Cofagrigus Tirtouga Carracosta Archen Archeops Trubbish Garbodor Zorua Zoroark Minccino Cinccino Gothita Gothorita Gothitelle Solosis Duosion Reuniclus Ducklett Swanna Vanillite Vanillish Vanilluxe Deerling Sawsbuck Emolga Karrablast Escavalier Foongus Amoonguss Frillish Jellicent Alomomola Joltik Galvantula Ferroseed Ferrothorn Klink Klang Klinklang Tynamo Eelektrik Eelektross Elgyem Beheeyem Litwick Lampent Chandelure Axew Fraxure Haxorus Cubchoo Beartic Cryogonal Shelmet Accelgor Stunfisk Mienfoo Mienshao Druddigon Golett Golurk Pawniard Bisharp Bouffalant Rufflet Braviary Vullaby Mandibuzz Heatmor Durant Deino Zweilous Hydreigon Larvesta Volcarona Cobalion Terrakion Virizion TornadusIncarnate Forme TornadusTherian Forme ThundurusIncarnate Forme ThundurusTherian Forme Reshiram Zekrom LandorusIncarnate Forme LandorusTherian Forme Kyurem KyuremBlack Kyurem KyuremWhite Kyurem KeldeoOrdinary Forme KeldeoResolute Forme MeloettaAria Forme MeloettaPirouette Forme Genesect Chespin Quilladin Chesnaught Fennekin Braixen Delphox Froakie Frogadier Greninja Bunnelby Diggersby Fletchling Fletchinder Talonflame Scatterbug Spewpa Vivillon Litleo Pyroar Flabébé Floette Florges Skiddo Gogoat Pancham Pangoro Furfrou Espurr MeowsticMale MeowsticFemale Honedge Doublade AegislashBlade Forme AegislashShield Forme Spritzee Aromatisse Swirlix Slurpuff Inkay Malamar Binacle Barbaracle Skrelp Dragalge Clauncher Clawitzer Helioptile Heliolisk Tyrunt Tyrantrum Amaura Aurorus Sylveon Hawlucha Dedenne Carbink Goomy Sliggoo Goodra Klefki Phantump Trevenant PumpkabooAverage Size PumpkabooSmall Size PumpkabooLarge Size PumpkabooSuper Size GourgeistAverage Size GourgeistSmall Size GourgeistLarge Size GourgeistSuper Size Bergmite Avalugg Noibat Noivern Xerneas Yveltal Zygarde50% Forme Diancie DiancieMega Diancie HoopaHoopa Confined HoopaHoopa Unbound Volcanion"

## Encode to tokens
selected_ids = tokenizer.encode(selected_tokens, return_tensors='pt')

In [34]:
set_seeds(42)

beam_scorer = BeamSearchScorer(batch_size=1,
                               num_beams=num_beams,
                               device=model.device)

# instantiate logits processors
logits_processor = LogitsProcessorList([BoostLogitsProcessor(boost_ids=selected_ids, 
                                                             boost_value=0.8)])

beam_output = model.beam_search(new_input_tensor, 
                                beam_scorer, 
                                logits_processor=logits_processor, 
                                max_length=50, 
                                output_scores=True, 
                                output_hidden_states=True, 
                                return_dict_in_generate=True, 
                                number_return_sequences=3)

In [35]:
print(tokenizer.decode(beam_output['sequences'][0]))

Vulpic offers many direct integrations that allow you to buildacross multiple devices robotsonomous.
 Virtual Reality

Virtual Reality allows you to interact withacross multiple devices robots robotsonomous.
 Virtual Reality allows you to interact withacross


#### Trying the CTRL model that was trained with a control word as the start token

In [36]:
# Initialize pipeline and choose the CTRL model
from transformers import pipeline
text_generator = pipeline("text-generation", model="ctrl")

  angle_rates = 1 / torch.pow(10000, (2 * (i // 2)) / d_model_size)


In [37]:
prompt2 = "Vulpic offers many direct integrations that allow you to build"
ctrl1 = "Computing "
ctrl2 = "Explain "
ctrl3 = "India "

In [38]:
# Generate text using greedy search
set_seeds(42)
gen_text = text_generator(ctrl1 + prompt2,
                          max_length=50,
                          do_sample=True,
                          top_k=25)

In [39]:
gen_text

[{'generated_text': 'Computing Vulpic offers many direct integrations that allow you to build a desktop machine or a server based on the Intel platform and also offers a plethora of tools for customizing your custom build. There is nothing like getting your new build up and running within 20'}]

#### Prompt Engineering: what happens to generated text when we modify the prompts

In [40]:
# encode context the generation is conditioned on
prompt2 = "GeoDude offers many direct integrations that allow you to build"
input_ids = tokenizer.encode(prompt2, return_tensors='pt')

In [41]:
set_seeds(42)

sample_output = model.generate(input_ids,
                               max_length=50,
                               do_sample=True,
                               num_return_sequences=3,
                               top_k=0,
                               top_p=0.8,
                               output_scores=True,
                               output_hidden_states=True,
                               return_dict_in_generate=True,
                               repetition_penalty=1.2,
                               temperature=1.2)

In [42]:
for i in range (0,3):
    print(tokenizer.decode(sample_output['sequences'][i]))

GeoDude offers many direct integrations that allow you to build folders from Intelli-Refine/Psykey archives and add libraries. But his focus is on app customization by turning multitasking in apps into modes of interchange, realizations
GeoDude offers many direct integrations that allow you to build custom moves, parts and functions while still retaining integrity. Your partner can find a wide range of trade-in options using Mario Console accessories/addons such as Coins (LEGO
GeoDude offers many direct integrations that allow you to build customized modules for Google Play services. The flexibility of its DNS and cvpn offerings have improved (shortening) supporting full server routes, proxy configurations built on Web2go/


### References

1. Setting seed for reproducable results: https://madewithml.com/courses/foundations/transformers/
2. Different kinds of decoding techniques: https://huggingface.co/blog/how-to-generate
3. Method styles to condition the generated text: https://lilianweng.github.io/lil-log/2021/01/02/controllable-neural-text-generation.html 

## Text Generation with GPT-Neo
- As it's a much larger sized model, generating text takes longer and preferable to run it on Google Colab

In [43]:
# Choose the GPT-Neo model
text_generator_neo = pipeline("text-generation", model="EleutherAI/gpt-neo-2.7B")

In [44]:
# Create the prompt
prompt1 = "Friends is a show about six young"
prompt2 = "Vulpic offers many direct integrations that allow you to build"

In [45]:
# Set a random seed to always generate same candidates
set_seeds(42)

# Generate text
gen_text = text_generator_neo(prompt2, 
                              max_length=50,
                              num_beams=3,
                              do_sample=True,
                              top_k=0,
                              top_p=0.8)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [46]:
gen_text

[{'generated_text': 'Vulpic offers many direct integrations that allow you to build custom integrations with your existing systems. For example, you can use Vulpic to integrate with your CRM, Salesforce, or other systems.\n\nVulpic also offers'}]

#### Text generation by calling the separate functions directly without using the pipeline abstraction

In [47]:
from transformers import GPTNeoForCausalLM, GPT2Tokenizer, GPTNeoConfig, LogitsProcessor, LogitsProcessorList
tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-2.7B")
config = GPTNeoConfig.from_pretrained("EleutherAI/gpt-neo-2.7B")
config.output_hidden_states = True
config.output_scores = True
config.pad_token_id = tokenizer.eos_token_id
model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-2.7B", config=config)

#### Trying to boost some tokens in the vocabulary

In [48]:
# encode context the generation is conditioned on
input_ids = tokenizer.encode(prompt2, return_tensors='pt')

In [49]:
num_beams = 3
new_input_tensor = torch.cat(3*[input_ids])

In [50]:
class BoostLogitsProcessor(LogitsProcessor):
    r"""
    :class:`transformers.BoostLogitsProcessor` boosting the score of the provided list of tokens by the boost_value.

    Args:
        boost_value (:obj:`int`):
            The parameter by which to boost the token score.
        boost_ids (:obj:`int`):
            The ids of the tokens to be boosted.
    """

    def __init__(self, boost_ids: torch.Tensor, boost_value: int):

        self.boost_ids = boost_ids
        self.boost_value = boost_value

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # collect scores of tokens that need boosting
        score = torch.gather(scores, 1, self.boost_ids)

        # boost score by the boost_value
        score = scores * self.boost_value

        scores.scatter_(1, self.boost_ids, score)
        return scores

In [51]:
selected_tokens = "Artificial intelligence Context awareness Cooperative systems Decision support Intelligent Autonomous Collective robots Knowledge based Expert Mobile agents engineering Inference mechanisms acquisition discovery representation Learning (artificial intelligence) Distance Electronic Backpropagation automata management Semisupervised Supervised Unsupervised Machine Boosting Robot Statistical Prediction methods Linear predictive coding encoding models mental development Computational Computation theory complexity Concurrent computing Greedy algorithms vector machines Evolutionary Particle swarm optimization Fuzzy control neural networks Hybrid Genetic Logic cognitive maps Takagi-Sugeno model Multivalued Probabilistic Sufficient conditions Pattern analysis Hebbian Self-organizing feature Biological Cellular Feedforward Multilayer perceptrons  Multi-layer network hardware Radial basis function Recurrent Hopfield Computers and information processing Approximate Computer applications Affective Application virtualization Edge Big data aided instruction generated music integrated manufacturing Green High energy physics instrumentation accelerator transfer Medical records Military Power system Publishing Bibliometrics Company reports Desktop Open Access Scientific Telecommunication Internetworking Soft switching Virtual enterprises machining Web sites Facebook MySpace Uniform resource locators design YouTube World Wide Mashups architecture architectures structures Arrays Binary diagrams Null value Octrees Persistent identifiers Table lookup Tree Dynamic voltage scaling Memory Multiprocessor interconnection Hypercubes Parallel Multicore Reconfigurable interfaces programming WebRTC Browsers Field buses Firewire Haptic gloves Force feedback Grasping Hypertext Interface phenomena states Musical instrument digital Ports (Computers) Ad hoc AODV Mesh Vehicular reliability Disruption tolerant networking base Middleboxes address translation synthesis Content distribution Cyberspace Diffserv Domain Name Ethernet EPON Google Heterogeneous Internet Crowdsourcing Instant messaging of Things telephony topology Semantic Social 2.0 services Intserv IP TCPIP Metropolitan area security servers Next generation Overlay Peer-to-peer Software defined Storage Token Unicast private Extranets performance errors crashes loss peripherals Disk drives Keyboards Modems Printers science Formal languages Runtime library (graphs) Augmented reality Automatic Concatenated codes Functional Granular Integer Microprogramming Object oriented Opportunistic profession Authentication crime Counterfeiting hacking Firewalls (computing) Identity Permission Analog Calculators Difference engines Microcomputers Portable Workstations Supercomputers Tablet Wearable Concurrency Processor scheduling Fastbus User-generated compression Adaptive Audio Huffman Source Test Transform conversion Analog-digital Digital-analog handling assimilation dissemination encapsulation Document Merging Sorting Associative Business collection integration preprocessing exchange Spreadsheet programs Text Triples (Data structure) warehouses Database preservation ISDN B-ISDN Local Wireless LAN Distributed Client-server Middleware Collaborative work communication databases Publish-subscribe Metacomputing Grid DNA File Image Active shape extraction Geophysical Gray-scale classification motion quality sequence texture detection Subtraction techniques capture color decomposition denoising enhancement filtering fusion Plasma displays Visual effects recognition reconstruction registration resolution High-resolution imaging Spatial restoration sampling segmentation sequences vision Morphological operations Optical Smart pixels coherence Buffer buffers Cache addressable Flash memories cells Magnetic Floppy disks Hard Nonvolatile single electron Phase change random DRAM chips Resistive RAM SDRAM SRAM Read only PROM Read-write Registers Shift Scanning probe Semiconductor Molecular Multitasking Parametric study Public Educational Resources Physical layer Multiprocessing flow Systolic Multithreading Pipeline Activity Character Clustering mining Association rules privacy Face Fingerprint Gesture Sign language Handwriting Forgery matching Speech Pervasive Ubiquitous Context-aware Petascale Platform Probability Quantum Real-time Embedded Invasive viruses worms Mediation Message-oriented Agent-based modeling as a service debugging maintenance packages EMTDC MATLAB PSCAD SPICE reusability safety tools Authoring Operating Program processors Utility Capability maturity verification environments Reasoning about compiler environment Microarchitecture Representational state libraries product lines recovery Checkpointing Core dumps Time sharing monitors Consumer electronics Ambient tapes Audio-visual Auditory Headphones Loudspeakers Microphones Microphone Pitch (audio) media players Sonification Home automation Refrigerators homes Washing Low-power Microwave ovens Multimedia"
# selected_tokens = "Bulbasaur Ivysaur Venusaur VenusaurMega Venusaur Charmander Charmeleon Charizard CharizardMega Charizard X CharizardMega Charizard Y Squirtle Wartortle Blastoise BlastoiseMega Blastoise Caterpie Metapod Butterfree Weedle Kakuna Beedrill BeedrillMega Beedrill Pidgey Pidgeotto Pidgeot PidgeotMega Pidgeot Rattata Raticate Spearow Fearow Ekans Arbok Pikachu Raichu Sandshrew Sandslash Nidoran♀ Nidorina Nidoqueen Nidoran♂ Nidorino Nidoking Clefairy Clefable Vulpix Ninetales Jigglypuff Wigglytuff Zubat Golbat Oddish Gloom Vileplume Paras Parasect Venonat Venomoth Diglett Dugtrio Meowth Persian Psyduck Golduck Mankey Primeape Growlithe Arcanine Poliwag Poliwhirl Poliwrath Abra Kadabra Alakazam AlakazamMega Alakazam Machop Machoke Machamp Bellsprout Weepinbell Victreebel Tentacool Tentacruel Geodude Graveler Golem Ponyta Rapidash Slowpoke Slowbro SlowbroMega Slowbro Magnemite Magneton Farfetch'd Doduo Dodrio Seel Dewgong Grimer Muk Shellder Cloyster Gastly Haunter Gengar GengarMega Gengar Onix Drowzee Hypno Krabby Kingler Voltorb Electrode Exeggcute Exeggutor Cubone Marowak Hitmonlee Hitmonchan Lickitung Koffing Weezing Rhyhorn Rhydon Chansey Tangela Kangaskhan KangaskhanMega Kangaskhan Horsea Seadra Goldeen Seaking Staryu Starmie Mr. Mime Scyther Jynx Electabuzz Magmar Pinsir PinsirMega Pinsir Tauros Magikarp Gyarados GyaradosMega Gyarados Lapras Ditto Eevee Vaporeon Jolteon Flareon Porygon Omanyte Omastar Kabuto Kabutops Aerodactyl AerodactylMega Aerodactyl Snorlax Articuno Zapdos Moltres Dratini Dragonair Dragonite Mewtwo MewtwoMega Mewtwo X MewtwoMega Mewtwo Y Mew Chikorita Bayleef Meganium Cyndaquil Quilava Typhlosion Totodile Croconaw Feraligatr Sentret Furret Hoothoot Noctowl Ledyba Ledian Spinarak Ariados Crobat Chinchou Lanturn Pichu Cleffa Igglybuff Togepi Togetic Natu Xatu Mareep Flaaffy Ampharos AmpharosMega Ampharos Bellossom Marill Azumarill Sudowoodo Politoed Hoppip Skiploom Jumpluff Aipom Sunkern Sunflora Yanma Wooper Quagsire Espeon Umbreon Murkrow Slowking Misdreavus Unown Wobbuffet Girafarig Pineco Forretress Dunsparce Gligar Steelix SteelixMega Steelix Snubbull Granbull Qwilfish Scizor ScizorMega Scizor Shuckle Heracross HeracrossMega Heracross Sneasel Teddiursa Ursaring Slugma Magcargo Swinub Piloswine Corsola Remoraid Octillery Delibird Mantine Skarmory Houndour Houndoom HoundoomMega Houndoom Kingdra Phanpy Donphan Porygon2 Stantler Smeargle Tyrogue Hitmontop Smoochum Elekid Magby Miltank Blissey Raikou Entei Suicune Larvitar Pupitar Tyranitar TyranitarMega Tyranitar Lugia Ho-oh Celebi Treecko Grovyle Sceptile SceptileMega Sceptile Torchic Combusken Blaziken BlazikenMega Blaziken Mudkip Marshtomp Swampert SwampertMega Swampert Poochyena Mightyena Zigzagoon Linoone Wurmple Silcoon Beautifly Cascoon Dustox Lotad Lombre Ludicolo Seedot Nuzleaf Shiftry Taillow Swellow Wingull Pelipper Ralts Kirlia Gardevoir GardevoirMega Gardevoir Surskit Masquerain Shroomish Breloom Slakoth Vigoroth Slaking Nincada Ninjask Shedinja Whismur Loudred Exploud Makuhita Hariyama Azurill Nosepass Skitty Delcatty Sableye SableyeMega Sableye Mawile MawileMega Mawile Aron Lairon Aggron AggronMega Aggron Meditite Medicham MedichamMega Medicham Electrike Manectric ManectricMega Manectric Plusle Minun Volbeat Illumise Roselia Gulpin Swalot Carvanha Sharpedo SharpedoMega Sharpedo Wailmer Wailord Numel Camerupt CameruptMega Camerupt Torkoal Spoink Grumpig Spinda Trapinch Vibrava Flygon Cacnea Cacturne Swablu Altaria AltariaMega Altaria Zangoose Seviper Lunatone Solrock Barboach Whiscash Corphish Crawdaunt Baltoy Claydol Lileep Cradily Anorith Armaldo Feebas Milotic Castform Kecleon Shuppet Banette BanetteMega Banette Duskull Dusclops Tropius Chimecho Absol AbsolMega Absol Wynaut Snorunt Glalie GlalieMega Glalie Spheal Sealeo Walrein Clamperl Huntail Gorebyss Relicanth Luvdisc Bagon Shelgon Salamence SalamenceMega Salamence Beldum Metang Metagross MetagrossMega Metagross Regirock Regice Registeel Latias LatiasMega Latias Latios LatiosMega Latios Kyogre KyogrePrimal Kyogre Groudon GroudonPrimal Groudon Rayquaza RayquazaMega Rayquaza Jirachi DeoxysNormal Forme DeoxysAttack Forme DeoxysDefense Forme DeoxysSpeed Forme Turtwig Grotle Torterra Chimchar Monferno Infernape Piplup Prinplup Empoleon Starly Staravia Staraptor Bidoof Bibarel Kricketot Kricketune Shinx Luxio Luxray Budew Roserade Cranidos Rampardos Shieldon Bastiodon Burmy WormadamPlant Cloak WormadamSandy Cloak WormadamTrash Cloak Mothim Combee Vespiquen Pachirisu Buizel Floatzel Cherubi Cherrim Shellos Gastrodon Ambipom Drifloon Drifblim Buneary Lopunny LopunnyMega Lopunny Mismagius Honchkrow Glameow Purugly Chingling Stunky Skuntank Bronzor Bronzong Bonsly Mime Jr. Happiny Chatot Spiritomb Gible Gabite Garchomp GarchompMega Garchomp Munchlax Riolu Lucario LucarioMega Lucario Hippopotas Hippowdon Skorupi Drapion Croagunk Toxicroak Carnivine Finneon Lumineon Mantyke Snover Abomasnow AbomasnowMega Abomasnow Weavile Magnezone Lickilicky Rhyperior Tangrowth Electivire Magmortar Togekiss Yanmega Leafeon Glaceon Gliscor Mamoswine Porygon-Z Gallade GalladeMega Gallade Probopass Dusknoir Froslass Rotom RotomHeat Rotom RotomWash Rotom RotomFrost Rotom RotomFan Rotom RotomMow Rotom Uxie Mesprit Azelf Dialga Palkia Heatran Regigigas GiratinaAltered Forme GiratinaOrigin Forme Cresselia Phione Manaphy Darkrai ShayminLand Forme ShayminSky Forme Arceus Victini Snivy Servine Serperior Tepig Pignite Emboar Oshawott Dewott Samurott Patrat Watchog Lillipup Herdier Stoutland Purrloin Liepard Pansage Simisage Pansear Simisear Panpour Simipour Munna Musharna Pidove Tranquill Unfezant Blitzle Zebstrika Roggenrola Boldore Gigalith Woobat Swoobat Drilbur Excadrill Audino AudinoMega Audino Timburr Gurdurr Conkeldurr Tympole Palpitoad Seismitoad Throh Sawk Sewaddle Swadloon Leavanny Venipede Whirlipede Scolipede Cottonee Whimsicott Petilil Lilligant Basculin Sandile Krokorok Krookodile Darumaka DarmanitanStandard Mode DarmanitanZen Mode Maractus Dwebble Crustle Scraggy Scrafty Sigilyph Yamask Cofagrigus Tirtouga Carracosta Archen Archeops Trubbish Garbodor Zorua Zoroark Minccino Cinccino Gothita Gothorita Gothitelle Solosis Duosion Reuniclus Ducklett Swanna Vanillite Vanillish Vanilluxe Deerling Sawsbuck Emolga Karrablast Escavalier Foongus Amoonguss Frillish Jellicent Alomomola Joltik Galvantula Ferroseed Ferrothorn Klink Klang Klinklang Tynamo Eelektrik Eelektross Elgyem Beheeyem Litwick Lampent Chandelure Axew Fraxure Haxorus Cubchoo Beartic Cryogonal Shelmet Accelgor Stunfisk Mienfoo Mienshao Druddigon Golett Golurk Pawniard Bisharp Bouffalant Rufflet Braviary Vullaby Mandibuzz Heatmor Durant Deino Zweilous Hydreigon Larvesta Volcarona Cobalion Terrakion Virizion TornadusIncarnate Forme TornadusTherian Forme ThundurusIncarnate Forme ThundurusTherian Forme Reshiram Zekrom LandorusIncarnate Forme LandorusTherian Forme Kyurem KyuremBlack Kyurem KyuremWhite Kyurem KeldeoOrdinary Forme KeldeoResolute Forme MeloettaAria Forme MeloettaPirouette Forme Genesect Chespin Quilladin Chesnaught Fennekin Braixen Delphox Froakie Frogadier Greninja Bunnelby Diggersby Fletchling Fletchinder Talonflame Scatterbug Spewpa Vivillon Litleo Pyroar Flabébé Floette Florges Skiddo Gogoat Pancham Pangoro Furfrou Espurr MeowsticMale MeowsticFemale Honedge Doublade AegislashBlade Forme AegislashShield Forme Spritzee Aromatisse Swirlix Slurpuff Inkay Malamar Binacle Barbaracle Skrelp Dragalge Clauncher Clawitzer Helioptile Heliolisk Tyrunt Tyrantrum Amaura Aurorus Sylveon Hawlucha Dedenne Carbink Goomy Sliggoo Goodra Klefki Phantump Trevenant PumpkabooAverage Size PumpkabooSmall Size PumpkabooLarge Size PumpkabooSuper Size GourgeistAverage Size GourgeistSmall Size GourgeistLarge Size GourgeistSuper Size Bergmite Avalugg Noibat Noivern Xerneas Yveltal Zygarde50% Forme Diancie DiancieMega Diancie HoopaHoopa Confined HoopaHoopa Unbound Volcanion"
selected_ids = tokenizer.encode(selected_tokens, return_tensors='pt')

In [52]:
from transformers import BeamSearchScorer
set_seeds(42)

beam_scorer = BeamSearchScorer(batch_size=1,
                               num_beams=num_beams,
                               device=model.device)

# instantiate logits processors
logits_processor = LogitsProcessorList([BoostLogitsProcessor(boost_ids=selected_ids, boost_value=2.67)])

beam_output = model.beam_search(new_input_tensor, 
                                beam_scorer, 
                                logits_processor=logits_processor, 
                                max_length=50, 
                                output_scores=True, 
                                output_hidden_states=True, 
                                return_dict_in_generate=True, 
                                number_return_sequences=3)

In [53]:
tokenizer.decode(beam_output['sequences'][0])

'Vulpic offers many direct integrations that allow you to build your own custom integrations, or integrate with existing ones.\n\nFor example, you can use Vulpic’ swarm mode to integrate with other services, or you can use Vul'