# Project 6 Starter Code (downloads/uses OpenAI's GPT-2 Model with 345M parameters)
### Dr. Sal Barbosa, Department of Computer Science, Middle Tennessee State University
> Some elements of this notebook are borrowed from the example by Denis Rothman in Transformers for Natural Language Processing. Packt Publishing Ltd, 2021.
> That example was built for TensorFlow version 1 and no longer works with TensorFlow 2.
> I have modified some aspects of that example, and of the underlying model code from OpenAI's repository, to address the errors listed below:
> <li>Incompatibilities arising from code developed in TensorFlow v1 but running in TensorFlow v2</li>
> <li>Removal of the tf.contrib module from TensorFlow 2</li>
> <li>Lack of backward compatibility between the older HParams module in tf.contrib (TensorFlow v1) and the new hparams module (which is pip installed separately)</li>
> 
> 
> Code Repository:
> [OpenAI GPT-2 Repository](https://github.com/openai/gpt-2)
>
> Model Paper:
> [Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, 2019, 'Language Models are Unsupervised Multitask Learners'](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)


### Load web proxy for TAMU FASTER system

In [None]:
import os
os.environ['http_proxy'] = 'http://10.72.8.25:8080'
os.environ['https_proxy'] = 'http://10.72.8.25:8080'

### Clone the OpenAI GPT-2 repository (comment out after first use)

In [None]:
#!git clone https://github.com/openai/gpt-2.git

### Replace files to address TensorFlow incompatibilities (comment out after first use)

In [None]:
#!unzip -o gpt2_replacement_files.zip "gpt-2/src/*" -d .

### Change into the repository directory and pip install requirements and the updated <i>hparams</i> module (comment out pip installs after first use)
>##### If installing on a clean default container, messages suggesting adding certain elements to PATH and those referring to version incompatibilities may be disregarded.

In [None]:
import os
os.chdir("gpt-2")    
#!pip3 install -r requirements.txt
#!pip3 install hparams

### Import tensorflow and check its version (the import currently prints some error messages, but everything still works)

In [None]:
import tensorflow as tf
print(tf.__version__)
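
A quick way to act on the printed version is to check its major number. The sketch below is only an illustration: `is_tensorflow_2` is a hypothetical helper, not part of TensorFlow or the cloned repository, and it assumes the usual `major.minor.patch` version string.

```python
def is_tensorflow_2(version):
    """Return True when a version string such as '2.15.0' has major version 2."""
    # Hypothetical helper: only the text before the first dot is inspected
    major = int(version.split(".")[0])
    return major == 2


print(is_tensorflow_2("2.15.0"))  # True
print(is_tensorflow_2("1.15.5"))  # False
```

In the notebook, the helper would be called as `is_tensorflow_2(tf.__version__)`.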

### Download the Pre-Trained Model (it is approximately 1.4 GB in size and may take a few minutes - comment out after first use)

In [None]:
#!python3 download_model.py '345M' 
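
After the download finishes, it can help to confirm that the model files are in place before moving on. The sketch below assumes the file names fetched by the repository's `download_model.py` and the `models/345M` location relative to the `gpt-2` directory; `EXPECTED_FILES` and `missing_model_files` are hypothetical names introduced here.

```python
from pathlib import Path

# File names assumed from the download script in the OpenAI gpt-2 repository
EXPECTED_FILES = [
    "checkpoint",
    "encoder.json",
    "hparams.json",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "vocab.bpe",
]


def missing_model_files(models_dir="models", model_name="345M"):
    """Return the expected file names that are absent from models_dir/model_name."""
    model_path = Path(models_dir) / model_name
    return [name for name in EXPECTED_FILES if not (model_path / name).is_file()]


missing = missing_model_files()
print("Missing files:", missing if missing else "none")
```

An empty result suggests the download completed; any listed name is a hint to rerun the download cell.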

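### Set output encoding to UTF-8

In [None]:
# `!export` would only change a temporary subshell, so set the variable through os.environ instead;
# child processes started from this notebook will inherit it.
os.environ['PYTHONIOENCODING'] = 'UTF-8'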

### Change to src directory

In [None]:
os.chdir("src")

### Import required modules

In [None]:
# General imports
import json
import numpy as np
import tensorflow as tf

### Import modules from the cloned repository
>##### NOTE: The first time this notebook is run, it may not recognize some pip-installed modules. FIX: Ensure unnecessary downloads and installations are commented out and RESTART the notebook kernel.

In [None]:
# Local imports (from the cloned repository)
import model, sample, encoder

### Function to interact with the model

In [None]:
# Model interaction function (modified from function in script "interactive_conditional_samples.py" and included in the notebook)
def interact_model(
    model_name,
    seed,
    nsamples,
    batch_size,
    length,
    temperature,
    top_k,
    models_dir
):
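    # Expand "~" and environment variables in the models path
    models_dir = os.path.expanduser(os.path.expandvars(models_dir))
    if batch_size is None:
        batch_size = 1
    # Samples are generated in whole batches, so nsamples must be a multiple of batch_size
    assert nsamples % batch_size == 0, "nsamples must be a multiple of batch_size"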

    enc = encoder.get_encoder(model_name, models_dir)

    with open(os.path.join(models_dir, model_name, 'hparams.json')) as f:
        hparams = json.load(f)
        #print(hparams)

    if length is None:
        length = hparams['n_ctx'] // 2
    elif length > hparams['n_ctx']:
        raise ValueError("Can't get samples longer than window size: %s" % hparams['n_ctx'])

    with tf.compat.v1.Session(graph=tf.Graph()) as sess:
        context = tf.compat.v1.placeholder(tf.int32, [batch_size, None])
        np.random.seed(seed)
        tf.compat.v1.set_random_seed(seed)
        output = sample.sample_sequence(
            hparams=hparams, length=length,
            context=context,
            batch_size=batch_size,
            temperature=temperature, top_k=top_k
        )

        saver = tf.compat.v1.train.Saver()
        ckpt = tf.train.latest_checkpoint(os.path.join(models_dir, model_name))
        saver.restore(sess, ckpt)

        # Keep prompting until the kernel is interrupted (the loop has no exit condition)
        while True:
            raw_text = input("Model prompt >>> ")
            while not raw_text:
                print('Prompt should not be empty!')
                raw_text = input("Model prompt >>> ")
            context_tokens = enc.encode(raw_text)
            generated = 0
            for _ in range(nsamples // batch_size):
                out = sess.run(output, feed_dict={
                    context: [context_tokens for _ in range(batch_size)]
                })[:, len(context_tokens):]
                for i in range(batch_size):
                    generated += 1
                    text = enc.decode(out[i])
                    print("=" * 40 + " SAMPLE " + str(generated) + " " + "=" * 40)
                    print(text)
            print("=" * 80)
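
The `temperature` and `top_k` arguments control how each next token is drawn. As a rough illustration (a plain-Python sketch, not the repository's TensorFlow code in `sample.py`), temperature divides the logits before the softmax, and top-k keeps only the k highest-scoring tokens. `sample_next_token` and the toy logits are hypothetical, and `top_k=0` is treated in this sketch as "no restriction".

```python
import math
import random


def sample_next_token(logits, temperature=0.9, top_k=40, rng=random):
    """Draw one token index from a list of logits (illustrative only)."""
    # Lower temperatures sharpen the distribution; higher ones flatten it
    scaled = [x / temperature for x in logits]
    if top_k > 0:
        k = min(top_k, len(scaled))
        # Score of the k-th best token; everything below it is masked out
        cutoff = sorted(scaled, reverse=True)[k - 1]
        scaled = [x if x >= cutoff else -math.inf for x in scaled]
    # Softmax weights, shifted by the maximum for numerical stability
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


toy_logits = [1.0, 5.0, 3.0, 0.0]
# With top_k=2 only indices 1 and 2 can be drawn
print(sample_next_token(toy_logits, temperature=0.9, top_k=2))
```

With `top_k=1` the sketch always returns the highest-scoring index, which is why small `top_k` values tend to give more repetitive text.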

In [None]:
# To summarize a passage, paste it at the model prompt and append "TL;DR:".
# To ask a question, paste the passage, add a space, then paste the question/answer text and press Enter.

# Prompt is excerpted from: Why Astronauts Are 'Stuck' on the International Space Station (https://www.usnews.com/news/national-news/articles/2024-08-07/why-astronauts-are-stuck-on-the-international-space-station)
prompt = '''
Boeing’s Starliner spacecraft carried two astronauts to the International Space Station roughly two months ago for its first crewed test flight. \
Now, their return to Earth is about seven weeks overdue with no return flight yet scheduled. The path home for Butch Wilmore and Suni Williams \
remains unclear, with NASA floating a few options on Wednesday at a press conference. The Boeing Starliner spacecraft successfully docked to the \
International Space Station on June 6, a day after launching, but several thrusters shut down during its approach, pushing the docking more than \
an hour behind schedule. However, NASA announced Tuesday that it has delayed SpaceX’s Crew-9 launch to the space station to allow for "more time \
for mission managers to finalize return planning for the agency’s Boeing Crew Flight Test currently docked to the orbiting laboratory." If the \
astronauts don’t come home on Starliner, they would return to Earth on SpaceX’s Crew Dragon vehicle.
'''
qa = '''
Q: What is the spacecraft called?
A: Starliner.
Q: How long was the docking delayed?
A: More than one hour.
Q: Has the astronauts' return been delayed?
A: Yes.
Q: What is the last name of the astronaut named Suni?
A: Williams.
Q: What is the name of the other astronaut flying with Suni?
A:
'''
# Prompt2 is excerpted from: 12 things that wowed us at the Paris Olympics (https://www.npr.org/2024/08/12/g-s1-16581/paris-olympics-best-moments)
prompt2 = '''
Before this summer, Steph Curry had achieved almost everything in basketball: four NBA titles, twice named NBA MVP, two-time gold \
medalist in the FIBA World Cup. But there was one big item missing from his resume: He'd never been to the Olympics. When he finally arrived \
in Paris, the 36-year-old was clearly determined to make the most of it. He went to root for other athletes. He traded pins and autographs. \
And by God he was going to win that Olympic gold. The U.S. men's basketball team was pushed to the limit in its semifinal game against Serbia. \
As his teammates struggled to find the basket, Curry made up the difference with nine three-pointers and 36 points overall. Then, two days \
later, after France clawed back to make the gold medal match a three-point game with less than 3 minutes to go, Curry's vision turned gold. \
He hit four triples to put the game away as the French play-by-play announcer called him "the devil named Curry."
'''

qa2 = '''
Q: By the time the Olympic games began, what had Steph Curry not done?
A: Win a gold medal.
Q: How many times was Curry named Most Valuable Player?
A: Twice.
Q: What team made it difficult for the U.S. team to win?
A: Serbia.
Q: How many 3-point baskets did Steph Curry get?
A: Nine.
Q: Did he enjoy the Olympics?
A: Yes.
Q: How did the game caller refer to Curry?
A:
'''
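
Because `input()` reads a single line, the passage and the question/answer text are easiest to paste as one line. The sketch below shows one way to assemble such strings, following the comments at the top of this cell; `flatten`, `build_summary_prompt`, and `build_qa_prompt` are hypothetical helpers, and the short strings stand in for `prompt` and `qa`.

```python
def flatten(text):
    """Collapse all runs of whitespace (including newlines) into single spaces."""
    return " ".join(text.split())


def build_summary_prompt(passage):
    """Passage followed by the TL;DR: cue."""
    return flatten(passage) + " TL;DR:"


def build_qa_prompt(passage, qa_text):
    """Passage, a space, then the question/answer text ending in an open 'A:'."""
    return flatten(passage) + " " + flatten(qa_text)


# Short stand-ins for the notebook's `prompt` and `qa` variables
demo_passage = "\nStarliner docked on June 6.\n"
demo_qa = "\nQ: What is the spacecraft called?\nA:\n"

print(build_summary_prompt(demo_passage))
# Starliner docked on June 6. TL;DR:
print(build_qa_prompt(demo_passage, demo_qa))
# Starliner docked on June 6. Q: What is the spacecraft called? A:
```

Printing `build_qa_prompt(prompt, qa)` in the notebook would give a one-line string that can be copied into the "Model prompt >>>" box. Flattening removes the line breaks between the Q/A pairs, which is a trade-off of the single-line input.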


> [!WARNING]
> **Warning and disclaimer:** This model was trained on over 40 gigabytes of data. It is impossible to know everything in that data or to predict what the model will output for a given prompt.
> The model's output may be <b>offensive</b> in a number of ways, including (but not limited to) expressions of <b>racial, gender, or religious biases and stereotypes</b>, use of <b>profanity</b>, utterances of <b>violent rhetoric</b>, or language or depictions containing <b>strong sexual content</b>.
> The instructor will not deliberately induce generation of such material and is not responsible for any such output.

### Prompt the Model to Generate Text

In [None]:
# Interactively generate text with the GPT-2 model
interact_model(
    model_name='345M',
    seed=None,
    nsamples=5,
    batch_size=1,
    length=100,
    temperature=0.9,
    top_k=40,
    models_dir='../models',
)