In [13]:
pip install -q gpt-2-simple
# Install the gpt-2-simple library with pip; the -q flag suppresses most of the output.
# gpt-2-simple is a Python package that provides a simplified interface to OpenAI's GPT-2
# language model, covering fine-tuning and text generation.

^C
Note: you may need to restart the kernel to use updated packages.


In [14]:
import gpt_2_simple as gpt2
from datetime import datetime

# Import gpt_2_simple under the shorter alias gpt2, which the rest of the notebook uses.
# The datetime module from the standard library is also imported; it is not used in the
# cells shown here, but is typically useful for timestamping runs or logs.

#from google.colab import files

In [3]:
gpt2.download_gpt2(model_name="124M")

# download_gpt2() downloads the GPT-2 model named by model_name, here the 124M version.
# GPT-2 comes in several sizes, named by parameter count; 124M is the smallest, with
# about 124 million parameters.
#
# The model must be downloaded before it can be used for text generation or fine-tuning.
# The files are stored locally (under models\124M) for future use.

Fetching checkpoint: 1.05Mit [00:00, 851Mit/s]                                                      
Fetching encoder.json: 1.05Mit [00:00, 3.42Mit/s]                                                   
Fetching hparams.json: 1.05Mit [00:00, ?it/s]                                                       
Fetching model.ckpt.data-00000-of-00001: 498Mit [01:04, 7.70Mit/s]                                  
Fetching model.ckpt.index: 1.05Mit [00:00, 773Mit/s]                                                
Fetching model.ckpt.meta: 1.05Mit [00:00, 4.86Mit/s]                                                
Fetching vocab.bpe: 1.05Mit [00:00, 4.84Mit/s]                                                      
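The download takes about a minute, so it can be worth checking whether the files are already on disk before calling download_gpt2() again. The following is a minimal sketch, not part of gpt-2-simple: missing_model_files and REQUIRED_FILES are hypothetical names, the file list is copied from the log above, and the models folder is assumed from the default location that appears later in the training log.

```python
import os

# File names copied from the download log above.
REQUIRED_FILES = [
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "vocab.bpe",
]


def missing_model_files(model_name="124M", model_dir="models"):
    """Return the expected model files that are not present on disk."""
    folder = os.path.join(model_dir, model_name)
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(folder, name))]


# Usage (commented out so the sketch runs without gpt_2_simple installed):
# if missing_model_files("124M"):
#     gpt2.download_gpt2(model_name="124M")
```

An empty list means every file named in the log is present, so the download can be skipped.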


In [5]:
file_name="C:\\Users\\lopez\\Documents\\ML Project\\shakespeare_sonnets_dataset.txt"

In [6]:
# Start a TensorFlow session for running the GPT-2 model.
sess = gpt2.start_tf_sess()

# Useful finetune() parameters:
# restore_from = 'fresh' to start training from the base GPT-2, or 'latest' to resume from an existing checkpoint
# sample_every = number of steps between printed example outputs
# print_every = number of steps between printed training-progress lines
# save_every = number of steps between saved checkpoints
# learning_rate = learning rate for the training (default: 1e-4, can lower to 1e-5 if you have
# run_name = subfolder within the checkpoint folder in which to save the model
# overwrite = set to True to continue fine-tuning an existing model without creating duplicate copies

# The session is used throughout the fine-tuning process.
gpt2.finetune(sess,
              dataset=file_name,
              model_name='124M',
              steps=100,
              restore_from='fresh',
              run_name='run1',
              print_every=10,
              sample_every=50,
              save_every=100)

# This cell starts a TensorFlow session and fine-tunes the 124M GPT-2 model on the dataset for 100 steps,
# printing the training progress every 10 steps and a generated sample every 50 steps. A checkpoint is
# saved every 100 steps under the run1 subfolder of the checkpoint folder.



Loading checkpoint models\124M\model.ckpt
INFO:tensorflow:Restoring parameters from models\124M\model.ckpt
Loading dataset...


100%|██████████| 1/1 [00:00<00:00,  6.12it/s]

dataset has 26012 tokens
Training...





[10 | 271.11] loss=4.16 avg=4.16
[20 | 558.87] loss=3.57 avg=3.86
[30 | 846.22] loss=3.30 avg=3.68
[40 | 1132.82] loss=3.01 avg=3.51
[50 | 1435.17] loss=2.44 avg=3.29
More I find that in these wastes I find nothing but thy smell
Which I cannot see in another's world,
To see, that I may taste thy self.
In these wastes I have not the knowledge
Which I may learn what I need, as in thy name;
I shall be as thy scythe in beauty doth appear;
And therefore thou art thine; for as in thy story, I must remain
For thee this story, which tells of thine own state.
The more I read of thy story, the more thou wilt,
With a more true esteem, and a more virtuous heart.<|endoftext|>

The story of youth will tell of thy youth,
And youth, like a flower, will show thee that
The stage hath shown thee this story.
Look what beauty you will grant, and what bounteousness
I require from your will, for I can no more grant
Than youth's sweet smell. May my love, that thou mightest enjoy
Than this flower that doth see
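
The progress lines in the output above have the shape [step | elapsed] loss=... avg=.... As a rough sketch, they can be parsed to estimate how long each training step takes. This assumes the second field is the elapsed time in seconds since training started; parse_progress and LOG_LINE are hypothetical helper names, not part of gpt-2-simple.

```python
import re

# Matches lines such as "[10 | 271.11] loss=4.16 avg=4.16".
LOG_LINE = re.compile(r"\[(\d+) \| ([\d.]+)\] loss=([\d.]+) avg=([\d.]+)")


def parse_progress(line):
    """Return (step, elapsed, loss, avg) for a progress line, else None."""
    match = LOG_LINE.fullmatch(line.strip())
    if match is None:
        return None
    step, elapsed, loss, avg = match.groups()
    return int(step), float(elapsed), float(loss), float(avg)


# Progress lines copied from the training output above.
log = """[10 | 271.11] loss=4.16 avg=4.16
[20 | 558.87] loss=3.57 avg=3.86
[30 | 846.22] loss=3.30 avg=3.68
[40 | 1132.82] loss=3.01 avg=3.51
[50 | 1435.17] loss=2.44 avg=3.29"""

rows = [row for row in map(parse_progress, log.splitlines()) if row]
first, last = rows[0], rows[-1]

# Elapsed time between the first and last logged step, divided by the steps in between.
seconds_per_step = (last[1] - first[1]) / (last[0] - first[0])
print(round(seconds_per_step, 1))  # prints 29.1
```

Under that assumption, the run above spends roughly half a minute per step, which is useful when deciding how many steps to request.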

In [15]:
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, reuse=True, checkpoint_dir='C:\\Users\\lopez\\Documents\\ML Project\\checkpoint')
# load_gpt2() loads a saved checkpoint into the TensorFlow session; here it is the fine-tuned model
# from run1 (the default run_name), as the log below confirms.
# reuse=True is passed through to the TensorFlow variable scope so that model variables already
# defined in the current graph are reused rather than created again.

Loading checkpoint C:\Users\lopez\Documents\ML Project\checkpoint\run1\model-100
INFO:tensorflow:Restoring parameters from C:\Users\lopez\Documents\ML Project\checkpoint\run1\model-100
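
Before calling load_gpt2(), it can help to confirm which checkpoints a run folder actually contains. The sketch below assumes that each saved checkpoint leaves a file named model-<step>.index in the run folder, matching the model-100 prefix in the log above; saved_steps is a hypothetical helper, not a gpt-2-simple function.

```python
import os
import re


def saved_steps(run_dir):
    """Return the sorted step numbers of the checkpoints found in run_dir."""
    if not os.path.isdir(run_dir):
        return []
    steps = set()
    for name in os.listdir(run_dir):
        # Assumption: one "model-<step>.index" file exists per saved checkpoint.
        match = re.fullmatch(r"model-(\d+)\.index", name)
        if match:
            steps.add(int(match.group(1)))
    return sorted(steps)


# Usage, with the folders from this notebook:
# checkpoint_dir = 'C:\\Users\\lopez\\Documents\\ML Project\\checkpoint'
# print(saved_steps(os.path.join(checkpoint_dir, 'run1')))
```

An empty list means either the folder does not exist or no checkpoint has been saved there yet.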


In [12]:
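# Generate a poem that starts with the user's prompt
def generate_poem(prompt):
    # With return_as_list=True, generate() returns a list of samples; take the only one
    poem = gpt2.generate(sess,
                         length=250,            # number of tokens to generate
                         temperature=0.9,       # higher values give more random text
                         top_k=40,
                         top_p=0.9,
                         prefix=prompt,         # the generated text starts with the prompt
                         truncate='<|endoftext|>',  # cut the sample at the end-of-text token
                         nsamples=1,
                         batch_size=1,
                         return_as_list=True)[0]
    return poem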

# Prompt user for input
user_input = input("Enter a prompt to generate a poem: ")

# Generate poem based on user input
poem = generate_poem(user_input)

# Print the generated poem
print("Generated Poem:")
print(poem)

Generated Poem:
i hate you
To whom my verse graces your heart;
And all my poor love to-day unfathered
In an ever-dreary world of poverty,
Poor you, but not still. Dear me, thou art so,
That all my favourites, all my fresco's true,
Hath thee, and now I know thee well:
For me thee'sw ere thou were made free,
A slave to slavery, whose freedom was death:
Yet thou, whom I cast free, liveth, and deserve,
Like a free babe, in thee all my time.
Thus do I wish thee well, my dear, love;
And, dear, do not let that aspiring song,
Which tells the age's young to like to be gone,
Save what a true character thou wouldst have thee:
Then would I like thee, but in this my voice
I sing not to sell thee, but to give thee thee:
Thus would I live, though not in thee, but in thee were sung,
The barren tender whom thee doth give nursing:
For thou, whom I have given life to, live'st thou again;
And life, if
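
The temperature, top_k and top_p arguments passed to generate() above control how the next token is sampled. The following pure-Python sketch illustrates the general technique on a toy list of logits. It is not gpt-2-simple's implementation, and the library may combine top_k and top_p differently. filter_probs and sample_token are hypothetical names, and treating top_k=0 as "disabled" is an assumption of this sketch.

```python
import math
import random


def filter_probs(logits, temperature=0.9, top_k=40, top_p=0.9):
    """Return {token_id: probability} after temperature, top-k and top-p filtering."""
    # Temperature: dividing the logits by a value below 1 sharpens the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Unnormalized softmax weights, shifted by the maximum for numerical stability.
    weights = [math.exp(x - peak) for x in scaled]

    # Rank token ids from most to least likely.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)

    # Top-k: keep only the k most likely tokens (0 disables the filter in this sketch).
    if top_k > 0:
        order = order[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability, renormalized
    # over the tokens that survived top-k, reaches top_p.
    total = sum(weights[i] for i in order)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += weights[i] / total
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(weights[i] for i in kept)
    return {i: weights[i] / kept_total for i in kept}


def sample_token(logits, rng=random, **kwargs):
    """Draw one token id from the filtered distribution."""
    filtered = filter_probs(logits, **kwargs)
    ids = list(filtered)
    return rng.choices(ids, weights=[filtered[i] for i in ids], k=1)[0]


# Toy example with four "tokens"; only the two most likely survive top_k=2.
print(sorted(filter_probs([2.0, 1.0, 0.0, -1.0], top_k=2)))  # prints [0, 1]
```

Lower temperature and smaller top_k or top_p make the output more conservative; higher values allow rarer tokens and more surprising text.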
