For part 2, I will use a pre-trained model and adjust several generation parameters to see which settings give better results. I will also compare two prompts, one longer than the other, to see how prompt length affects the output.

In [1]:
import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer

The model that I will be using is DistilGPT2, a distilled version of GPT-2. It is used for similar tasks, but it has the advantage of being smaller and easier to run than the base model.

In [2]:
MODEL_NAME = 'distilgpt2'

#model = AutoModelWithLMHead.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

In [3]:
prompt = "He hasn't seen the old man since"
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")

Using the prompt above, we will generate two sets of outputs. The first uses the default values and the second uses an adjusted value (temperature=0.25), so we can compare which setting gives better output.

In [4]:
prompt_length = len(tokenizer.decode(inputs[0], skip_special_tokens=True, clean_up_tokenization_spaces=True))
outputs = model.generate(inputs, max_length=20, do_sample=True, top_p=1, top_k=50, temperature=1, num_return_sequences=10)
outputs_2 = model.generate(inputs, max_length=20, do_sample=True, top_p=1, top_k=50, temperature=0.25, num_return_sequences=10)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
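
The warning above appears because `generate` received only the token ids. A minimal sketch of one way to avoid it is shown below: call the tokenizer directly so it also returns an attention mask, and pass `pad_token_id` explicitly. The helper name `generate_with_mask` is hypothetical and not part of the notebook or the library; it only wraps calls the notebook already uses.

```python
def generate_with_mask(model, tokenizer, prompt, **gen_kwargs):
    """Encode `prompt` and call `generate` with an explicit mask and pad id.

    Hypothetical helper: `model` and `tokenizer` are assumed to be the
    objects loaded earlier with the Auto* classes.
    """
    # Calling the tokenizer (instead of .encode) returns a mapping that
    # holds both "input_ids" and "attention_mask".
    encoded = tokenizer(prompt, add_special_tokens=False, return_tensors="pt")
    return model.generate(
        encoded["input_ids"],
        attention_mask=encoded["attention_mask"],
        # GPT-2 style models define no pad token, so reuse end-of-sequence,
        # which is what the library falls back to in the warning above.
        pad_token_id=tokenizer.eos_token_id,
        **gen_kwargs,
    )
```

In this notebook it could be called as `generate_with_mask(model, tokenizer, prompt, max_length=20, do_sample=True, temperature=0.25, num_return_sequences=10)`. With a single unpadded prompt the generated text should not change; the explicit arguments mainly silence the warning and make the call safe for batched, padded inputs.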


The generated text labeled "normal" uses the default values, whereas the text labeled "testing" uses the adjusted values.



After some testing, setting the temperature to a low value makes the generated text stick to the same words. In this example, "...he was a kid. He's a good kid." appeared frequently. With the default values, the wording was more varied. In this test, lowering the temperature below 1 reduced variety, so keeping it at 1 or higher gave better results here.
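
To see why a low temperature produces repeated phrases, here is a small sketch of the usual formulation, in which the scores (logits) are divided by the temperature before the softmax. The logits are made-up numbers for four candidate tokens, not values taken from DistilGPT2, and `softmax_with_temperature` is a hypothetical helper.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

# Hypothetical scores for four candidate next tokens (assumed values).
logits = [2.0, 1.0, 0.5, 0.1]
for t in (0.25, 1.0, 2.0):
    # Lower temperatures concentrate probability on the top-scoring token.
    print(t, np.round(softmax_with_temperature(logits, t), 3))
```

At a temperature of 0.25 nearly all of the probability lands on the highest-scoring token, so sampling keeps choosing the same continuation. At 2.0 the distribution is flatter, which allows more variety but also more unlikely words.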

In [5]:
for i in range(len(outputs)):
    generated = tokenizer.decode(outputs[i])
    generated_2 = tokenizer.decode(outputs_2[i])
    print(generated + " <- normal")
    print(generated_2 + " <- testing")

He hasn't seen the old man since he was a kid. He just has a few days left <- normal
He hasn't seen the old man since he was a kid. He's a good kid. He <- testing
He hasn't seen the old man since he was born.

"I heard that the [ <- normal
He hasn't seen the old man since he was a kid.






 <- testing
He hasn't seen the old man since.










 <- normal
He hasn't seen the old man since the beginning of his career. He's been a big part <- testing
He hasn't seen the old man since he was 9-years-old.



 <- normal
He hasn't seen the old man since he was a kid, but he's been a great kid <- testing
He hasn't seen the old man since his death.








 <- normal
He hasn't seen the old man since he was a kid. He's been a good kid for <- testing
He hasn't seen the old man since the day he joined the army as a result of the war <- normal
He hasn't seen the old man since the last time he was in the White House, but he <- testing
He hasn't seen the old man since he died.” The man has his own f

Now, using what we learned, we can adjust the other parameters (top_p, top_k) to see if we can get a better result. Setting top_p to 0.5 causes the same issue we encountered before: the "...he was a kid" phrase shows up often, and the sentences make less sense. Setting top_k to 25, on the other hand, doesn't have much impact, but the results still look worse than with the default values.
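
The sketch below illustrates what the two parameters do to a next-token distribution. It is a simplified version that works on probabilities; the library applies the same idea to logits and differs in detail. The function names and the example probabilities are assumptions made for illustration.

```python
import numpy as np

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalise."""
    probs = np.asarray(probs, dtype=float)
    keep = np.argsort(probs)[::-1][:k]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]          # token indices, most likely first
    cumulative = np.cumsum(probs[order])
    # Number of tokens needed for the cumulative mass to reach p.
    cutoff = int(np.searchsorted(cumulative, p)) + 1
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

# Hypothetical next-token probabilities (assumed values).
probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_filter(probs, k=3))
print(top_p_filter(probs, p=0.6))
```

A small top_p removes most of the candidates whenever one or two tokens already hold most of the probability, which is consistent with the repeated phrases seen at top_p=0.5. Lowering top_k from 50 to 25 only drops tokens that were already unlikely, which may explain why it changed so little.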

In [6]:
outputs_3 = model.generate(inputs, max_length=20, do_sample=True, top_p=0.5, top_k=50, temperature=1, num_return_sequences=10)
outputs_4 = model.generate(inputs, max_length=20, do_sample=True, top_p=1, top_k=25, temperature=1, num_return_sequences=10)
for i in range(len(outputs)):
    generated = tokenizer.decode(outputs[i])
    generated_2 = tokenizer.decode(outputs_3[i])
    generated_3 = tokenizer.decode(outputs_4[i])
    print(generated + " <- normal")
    print(generated_2 + " <- top_p")
    print(generated_3 + " <- top_k")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


He hasn't seen the old man since he was a kid. He just has a few days left <- normal
He hasn't seen the old man since he was a kid.






 <- top_p
He hasn't seen the old man since he left prison in 1996.





 <- top_k
He hasn't seen the old man since he was born.

"I heard that the [ <- normal
He hasn't seen the old man since he was younger, and he's not a bad guy," <- top_p
He hasn't seen the old man since it was taken away. There were tears at me and I <- top_k
He hasn't seen the old man since.










 <- normal
He hasn't seen the old man since the start of the season.





 <- top_p
He hasn't seen the old man since '08 and he's been working, he's not going <- top_k
He hasn't seen the old man since he was 9-years-old.



 <- normal
He hasn't seen the old man since he was younger.







 <- top_p
He hasn't seen the old man since this past season and he was the best at his job on <- top_k
He hasn't seen the old man since his death.








 <- normal
He hasn't seen the old man si

Now, let's generate sentences with all of the parameters adjusted. We know what not to use, so let's try values that should give us good results.

Using these parameters (top_p=0.9, top_k=75, temperature=2.0) gives more variety in word choice and sentences that make more sense, although some words still seem misplaced. I think that has more to do with how the model was trained.
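
"Variety" is judged by eye here. One rough way to put a number on it is the share of distinct word n-grams across a batch of generations. The sketch below is an illustrative helper, not part of the notebook, and the two example batches are made-up strings that only echo the kind of output seen earlier.

```python
def distinct_n(texts, n=2):
    """Fraction of word n-grams across `texts` that are unique (higher = more varied)."""
    ngrams = []
    for text in texts:
        words = text.split()
        ngrams.extend(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

# Hypothetical continuations (assumed strings, with the prompt already removed).
repetitive = ["he was a kid", "he was a kid", "he was a good kid"]
varied = ["he left prison in 1996", "the start of the season", "his death"]
print(distinct_n(repetitive), distinct_n(varied))
```

To apply it to the notebook's results, the decoded strings for each setting could be passed in as a list, ideally after cutting off the shared prompt so that it does not dominate the count.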

In [7]:
outputs_4 = model.generate(inputs, max_length=20, do_sample=True, top_p=0.9, top_k=75, temperature=2.0, num_return_sequences=10)
for i in range(len(outputs_4)):
    generated = tokenizer.decode(outputs[i])
    generated_2 = tokenizer.decode(outputs_4[i])
    print(generated + " <- normal")
    print(generated_2 + " <- testing")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


He hasn't seen the old man since he was a kid. He just has a few days left <- normal
He hasn't seen the old man since you have his ass cut, but I haven't never tried <- testing
He hasn't seen the old man since he was born.

"I heard that the [ <- normal
He hasn't seen the old man since then.' Trump will surely follow to them through the coming elections <- testing
He hasn't seen the old man since.










 <- normal
He hasn't seen the old man since 1980, as recently as April this and again before in 1991 <- testing
He hasn't seen the old man since he was 9-years-old.



 <- normal
He hasn't seen the old man since January, either, although she remembers going to some hospital during <- testing
He hasn't seen the old man since his death.








 <- normal
He hasn't seen the old man since it began. To my mind (in one situation, after <- testing
He hasn't seen the old man since the day he joined the army as a result of the war <- normal
He hasn't seen the old man since the moment. Howev

We'll use the same parameter values we tested, but on a longer prompt, and see how the results compare.

In [8]:
# prompt_2 = "He hasn't seen the old man since he left to Italy to study abroad for"
prompt_2 = "The old photos made her conscious of her age, of how much time had passed - and of what an interesting life she'd had."
inputs = tokenizer.encode(prompt_2, add_special_tokens=False, return_tensors="pt")

In [9]:
prompt_length = len(tokenizer.decode(inputs[0], skip_special_tokens=True, clean_up_tokenization_spaces=True))
outputs = model.generate(inputs, max_length=45, do_sample=True, top_p=1, top_k=50, temperature=1, num_return_sequences=10)
outputs_2 = model.generate(inputs, max_length=45, do_sample=True, top_p=.9, top_k=75, temperature=2.0, num_return_sequences=10)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Giving the model a longer prompt gives it more context to work with, so it generates better-formed sentences.
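
One thing to keep in mind when comparing the two prompts is that `max_length` counts the prompt tokens as well as the generated ones, so a longer prompt leaves fewer tokens for the continuation. The sketch below shows the arithmetic; `new_token_budget` is a hypothetical helper and the token counts are assumed numbers, not measured from the prompts in this notebook.

```python
def new_token_budget(prompt_token_count, max_length):
    """Tokens left for generation when max_length includes the prompt."""
    return max(0, max_length - prompt_token_count)

# Assumed token counts, for illustration only.
print(new_token_budget(prompt_token_count=8, max_length=20))
print(new_token_budget(prompt_token_count=30, max_length=45))
```

In the notebook the real prompt length in tokens is available as `inputs.shape[1]`. If the goal is to generate the same number of new tokens for both prompts, `generate` also accepts `max_new_tokens`, which does not count the prompt.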

In [10]:
for i in range(len(outputs)):
    generated = tokenizer.decode(outputs[i])
    generated_2 = tokenizer.decode(outputs_2[i])
    print(generated + " <- normal")
    print(generated_2 + " <- testing")

The old photos made her conscious of her age, of how much time had passed - and of what an interesting life she'd had.



Maintained the knowledge that both the author and her family might have <- normal
The old photos made her conscious of her age, of how much time had passed - and of what an interesting life she'd had. Not in memory from this wonderful place on whose last night out at the funeral. That night <- testing
The old photos made her conscious of her age, of how much time had passed - and of what an interesting life she'd had. And of being so often involved in family matters in which she couldn't even be a parent <- normal
The old photos made her conscious of her age, of how much time had passed - and of what an interesting life she'd had. All too naturally, and at night she did her shopping herself. That day you look like <- testing
The old photos made her conscious of her age, of how much time had passed - and of what an interesting life she'd had.


But the photos, taken 