##📚 Import pandas, numpy, and HuggingFace pipeline

In [13]:
import pandas as pd
import numpy as np
from transformers import pipeline


##📂 Load Robert Frost poems dataset

In [14]:
poems=pd.read_csv('/content/robert_frost_collection.csv')

##👀 Show first 5 rows of poems dataset

In [15]:
poems.head()

| Unnamed: 0 | Name | Content | Collection | Year of Publication |
|---|---|---|---|---|
| 0 | | | | |
| 1 | Stopping by Woods on a Snowy Evening | Whose woods these are I think I know. \nHis ... | New Hampshire | 1923.0 |
| 2 | Fire and Ice | Some say the world will end in fire,\nSome say... | New Hampshire | 1923.0 |
| 3 | The Aim was Song | Before man came to blow it right\nThe wind onc... | New Hampshire | 1923.0 |
| 4 | The Need of Being Versed in Country Things | The house had gone to bring again\nTo the midn... | New Hampshire | 1923.0 |


##📝 Get poem texts (drop missing values and convert to list)

In [16]:
Content =poems["Content"].dropna().tolist()

##✂️ Split poems into individual non-empty lines and preview first 5

In [17]:
lines=[]
for poem in Content:
  for line in poem.split("\n"):
    lines.append(line.rstrip())

lines=[line for line in lines if len(line)>0]
lines[:5]


['Whose woods these are I think I know.',
 'His house is in the village though;',
 'He will not see me stopping here',
 'To watch his woods fill up with snow.',
 'My little horse must think it queer']
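
The same splitting step can be written with `str.splitlines()`, which also copes with `\r\n` line endings. This is a minimal sketch: the helper name `split_into_lines` and the small sample list are assumptions for illustration, not part of the notebook.

```python
def split_into_lines(poems):
    """Split each poem into lines, strip trailing whitespace, drop empties."""
    lines = []
    for poem in poems:
        for line in poem.splitlines():
            line = line.rstrip()
            if line:  # skip blank lines between stanzas
                lines.append(line)
    return lines


# Hypothetical sample standing in for poems["Content"].dropna().tolist()
sample = ["Some say the world will end in fire,\nSome say in ice.\n\n"]
print(split_into_lines(sample))
```

Wrapping the logic in a function makes it easy to reuse on another text column without re-running the loop cell.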

##🤖 Initialize text generation model

In [19]:
gen= pipeline("text-generation")

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Device set to use cpu
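
The log above warns against relying on the default model. One way to address that is to name the model and revision explicitly. The sketch below copies both values from the log message; the helper name `build_generator` is an assumption, and the import is placed inside the function only so the definition loads even where `transformers` is not installed.

```python
def build_generator(model="openai-community/gpt2", revision="607a30d"):
    """Create a text-generation pipeline pinned to a model and revision."""
    # Imported lazily so this definition can be loaded without transformers.
    from transformers import pipeline

    return pipeline("text-generation", model=model, revision=revision)


# Usage (downloads the model on the first call):
# gen = build_generator()
```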


##👀 Show the first poem line

In [20]:
lines[0]

'Whose woods these are I think I know.'

##📝 Generate continuation of the first poem line (max_length=20 tokens; the default max_new_tokens=256 overrides it)

In [21]:
gen(lines[0],max_length=20)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=20) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': "Whose woods these are I think I know. I've never seen them, but they're really pretty. The trees are really beautiful, and the foliage is really nice. The weather's kind of nice. I've never seen some kind of cloud or fog, but I think there are some places that look like snow. I guess it's just snowing across the plains. I've never seen snow.\n\nDo you have any idea what kind of a wind tunnel you can put in your garage?\n\nWhat I don't know is how you do it. I think it's pretty simple. You put a box of ice cubes all over the driveway and you move it. You open that box and you take a bunch of ice cubes, put a couple of them in, and you put the others in. That way you can put the ice cubes in there and they just stay there for a while.\n\nYou're saying you don't have any idea what kind of a wind tunnel you can put in your garage?\n\nThat's a good question. What I do know is that I've been working on the wind tunnel for twenty-five years, and I've never seen anything l
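
The log above reports that `max_new_tokens` (=256) took precedence over `max_length`, which is why the output runs far past 20 tokens. Passing `max_new_tokens` directly is one way to get the intended cap. In this sketch, `continue_line` is a hypothetical helper and `generator` is assumed to be any callable that behaves like `gen`.

```python
def continue_line(generator, prompt, max_new_tokens=20):
    """Return one continuation, capped at max_new_tokens generated tokens."""
    # max_new_tokens counts newly generated tokens only: not words,
    # and not the tokens of the prompt itself.
    results = generator(prompt, max_new_tokens=max_new_tokens)
    return results[0]["generated_text"]


# Usage with the pipeline from this notebook:
# print(continue_line(gen, lines[0], max_new_tokens=20))
```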

##🔄 Generate 2 different continuations for the second poem line (max_length=30 tokens)

In [23]:
gen(lines[1],max_length=30, num_return_sequences=2)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'His house is in the village though; we have no idea what is going on. It is only after we return to the village that we can go to the village again."\n\nAs the sun rose above the horizon, it rose above the village again. "I hear a commotion," said the young man, "and it is a man named Asma who is doing what he is doing."\n\nThe young man was now talking with Asma, and as they spoke, the man was on the ground, talking to the children, and he could not tell them anything, save that he was lying on the ground. He was going to ask them to come to his house, but they refused, saying that they had never heard of the little man.\n\nAsma looked at Asma, and said, "It\'s Asma who is doing what he is doing."\n\n"Tell me, Asma, why do you do what you are doing?" asked the young man.\n\nAsma said, "It is because I know that it is my duty to protect you. I will not be able to do it without you knowing."\n\nAsma went to his house, and there was no one there. Then Asma went to th
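
With `num_return_sequences` set, the pipeline returns a list with one `{'generated_text': ...}` dict per continuation. A small sketch for listing them one by one follows; `format_candidates` and the sample data are assumptions, and the sample strings are placeholders rather than real model output.

```python
def format_candidates(results):
    """Turn a list of {'generated_text': ...} dicts into numbered strings."""
    return [
        f"[{i}] {item['generated_text']}"
        for i, item in enumerate(results, start=1)
    ]


# Hypothetical results shaped like the pipeline output, not real model text
fake_results = [
    {"generated_text": "His house is in the village though; first"},
    {"generated_text": "His house is in the village though; second"},
]
for entry in format_candidates(fake_results):
    print(entry)
```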

##🛠️ Import textwrap for formatting long text

In [24]:
import textwrap

##📜 Define helper function to neatly wrap long text lines

In [30]:
def wrap(x):
  return textwrap.fill(x,replace_whitespace= False, fix_sentence_endings=True)
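
To see what the two options in `wrap` change, the sketch below compares `textwrap.fill` with its defaults against the settings used in this notebook. The sample string is made up for illustration.

```python
import textwrap

text = "It ends here. Then a new line\nstarts after the break."

# Defaults: the newline is replaced by a space and spacing is left alone.
default = textwrap.fill(text)

# Notebook settings: newlines survive, and a detected sentence ending
# followed by a single space gets two spaces instead.
custom = textwrap.fill(text, replace_whitespace=False, fix_sentence_endings=True)

print(repr(default))
print(repr(custom))
```

Keeping the original newlines is what preserves paragraph breaks in the generated text, at the cost of some uneven line lengths.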

##🖨️ Generate continuation for first line and print wrapped text

In [31]:
out=gen(lines[0], max_length=30)
print(wrap(out[0]['generated_text']))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


Whose woods these are I think I know.  But you'd think they'd come
back for a little while longer.  I'm not sure if I've ever seen
someone this close to me.  I'd say that's probably because of the way
I looked.  Maybe because of their mannerisms.  I don't know.  I think
there would be some connection to the way I looked.  I'd be looking
down on this place.  I'd find a way to show it off.  I would show that
I was beautiful and had a great personality.  I'd go crazy with it.  I
want to show that I was good at my job.  I'd be a lot more attractive
to other people.  I want to show that I was a good person.  I want to
look like it.  I get to do that.  I'm not a child-making person.  I'm
a craftsman.  I'm a metalworker.  I use tools I don't have.  I'm a
builder.  I build things.  I'm not a sculptor.  I don't do
metalworking.  I'm a painter.  I make things that I don't have.  I'd
never be able to do that.  I'd never be able to do that.  I would
never be able to do that.  I'm not a big craftsm

##💡 Give custom prompt and generate extended text (max 100 tokens)

In [27]:
prompt = "transformers have a wide variety of applications in nlp"
out=gen(prompt, max_length=100)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=100) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


##🖨️ Print `out` in wrapped format (cells ran out of order, so `out` still holds the In [31] result)

In [32]:
print(wrap(out[0]['generated_text']))

Whose woods these are I think I know.  But you'd think they'd come
back for a little while longer.  I'm not sure if I've ever seen
someone this close to me.  I'd say that's probably because of the way
I looked.  Maybe because of their mannerisms.  I don't know.  I think
there would be some connection to the way I looked.  I'd be looking
down on this place.  I'd find a way to show it off.  I would show that
I was beautiful and had a great personality.  I'd go crazy with it.  I
want to show that I was good at my job.  I'd be a lot more attractive
to other people.  I want to show that I was a good person.  I want to
look like it.  I get to do that.  I'm not a child-making person.  I'm
a craftsman.  I'm a metalworker.  I use tools I don't have.  I'm a
builder.  I build things.  I'm not a sculptor.  I don't do
metalworking.  I'm a painter.  I make things that I don't have.  I'd
never be able to do that.  I'd never be able to do that.  I would
never be able to do that.  I'm not a big craftsm

##💡 Generate and neatly print text continuation for custom NLP prompt

In [34]:
prompt= "transformers have a wide variety of applications in nlp"
out =gen(prompt, max_length=100)
print(wrap(out[0]['generated_text']))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=100) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


transformers have a wide variety of applications in nlp.  It is
possible to create and manipulate data files and also to manipulate
data in a manner similar to writing a program to read the XML file.
However, the same type of nlp can be used to read data files.  For
example, it can read the XML file "Hello World" and write it to
C:\Program Files\Microsoft Outlook.

The following example shows how
to read the data files in the "Hello World" file.

We can read data
files in the "Hello World" file from a file named "C:\Program
Files\Microsoft Outlook".

We can write data files in the "Hello
World" file into a file named "C:\Program Files\Microsoft Outlook".
The following example shows how to read the data files in the "Hello
World" file.

We can write data files in the "Hello World" file into a
file named "C:\Program Files\Microsoft Outlook".

We can write data
files in the "Hello World" file into a file named "C:\Program
Files\Microsoft Outlook".

We can write data files in the "Hello
Wo
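
Every `generated_text` shown in this notebook starts with the prompt itself. If only the new text is wanted, the prompt can be stripped afterwards. This is a minimal sketch with a hypothetical helper, `continuation_only`; the pipeline call should also accept `return_full_text=False` for the same purpose, though that option is not used in this notebook.

```python
def continuation_only(prompt, generated_text):
    """Drop the leading prompt from a generated string, if present."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text


prompt = "Whose woods these are I think I know."
# Opening of the In [21] output above, shortened for the example
text = prompt + " I've never seen them,"
print(continuation_only(prompt, text))
```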