# **Generation of Star Wars Text**

# Setup

### Install the HuggingFace Transformers library

I use a slightly older version of HuggingFace's Transformers library to guarantee compatibility with the fine-tuning script.

Fine-tuning can be performed with the library's `run_language_modeling.py` script, or with the version in the GitHub repository, which I have modified to run more evaluation during training.

To run the fine-tuning, the training data must first be loaded into the Colab environment.
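
Before launching training, it can help to confirm that the data files are actually present. The following is a minimal sketch: the helper name `missing_files` is hypothetical, and the two paths are the ones passed to the fine-tuning command later in this notebook.

```python
import os

def missing_files(paths):
    """Return the subset of `paths` that does not exist as a regular file."""
    return [p for p in paths if not os.path.isfile(p)]

# Paths taken from the fine-tuning command used in this notebook.
EXPECTED = ["/content/star_wars_train.txt", "/content/star_wars_valid.txt"]

missing = missing_files(EXPECTED)
if missing:
    print("Upload these files before training:", missing)
else:
    print("All training files found.")
```

If any path is reported as missing, upload the file through the Colab file browser and run the check again.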

In [1]:
!git clone https://github.com/huggingface/transformers


Cloning into 'transformers'...
remote: Enumerating objects: 35, done.
remote: Counting objects: 100% (35/35), done.
remote: Compressing objects: 100% (34/34), done.
remote: Total 64814 (delta 13), reused 2 (delta 0), pack-reused 64779
Receiving objects: 100% (64814/64814), 48.73 MiB | 29.56 MiB/s, done.
Resolving deltas: 100% (45936/45936), done.


In [2]:
# imports and utilities 
import os
os.chdir('/content/transformers')

# Use language modeling version as of April 21st.
!git checkout b1ff0b2ae7d368b7db3a8a8472a29cc195d278d8

!pip install .
!pip install -r ./examples/requirements.txt

os.chdir('/content/transformers/examples')


import torch
import run_language_modeling
import run_generation
import collections
import random
import numpy as np

from transformers import AutoConfig
from transformers import AutoTokenizer
from transformers import AutoModelWithLMHead
from transformers import GPT2Config
from transformers import GPT2LMHeadModel

def to_object(item):
    """
    Convert a dictionary to an object (recursive).
    """
    def convert(item):
        if isinstance(item, dict):
            # Build an anonymous class whose attributes are the dict's keys.
            return type('jo', (), {k: convert(v) for k, v in item.items()})
        if isinstance(item, list):
            return [convert(value) for value in item]
        return item

    return convert(item)

Note: checking out 'b1ff0b2ae7d368b7db3a8a8472a29cc195d278d8'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at b1ff0b2a Fix bug in examples: double wrap into DataParallel during eval
Processing /content/transformers
Collecting tokenizers==0.7.0rc7
  Downloading https://files.pythonhosted.org/packages/17/f6/c910fd504ded3072c5c810a5c1572c41e7cec5a5f7879d44be533ab881a4/tokenizers-0.7.0rc7-cp37-cp37m-manylinux1_x86_64.whl (5.6MB)
     |████████████████████████████████| 5.6MB 13.1MB/s
Collecting boto3
  Downloading https://files.pythonhosted.org/packages/bd/c8/b5aac643697038ef6eb8c11c73b9ee9c2dc8cb2bc95cd


### Mount Google Drive

Google Drive is used to save model checkpoints during and after training.

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# Launch fine-tuning


Fine-tuning is performed by calling the script from the command line. The following hyperparameters can be modified:

* `--num_train_epochs`: The number of times to iterate over the training set.
* `--block_size`: The training text is split into blocks of this many tokens.
* `--gradient_accumulation_steps`: The number of batches over which gradients are accumulated before each weight update. Set this above 1 when the batch size is very small, to improve training stability.
* `--output_dir`: The directory where checkpoints are saved.
* `--model_name_or_path`: The path to the model weights used as the starting point for fine-tuning.
* `--learning_rate`: The initial learning rate of the optimizer (AdamW in this script).
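
To see how these settings interact, the sketch below estimates the number of training blocks, the effective batch size, and the number of weight updates for a run. It is an approximation under stated assumptions, not the script's own code: the function name `training_schedule` is hypothetical, the text is assumed to be cut into non-overlapping blocks with any trailing partial block dropped, and the token count in the example is invented for illustration.

```python
def training_schedule(num_tokens, block_size, per_gpu_batch_size,
                      gradient_accumulation_steps, num_train_epochs, n_gpu=1):
    """Estimate block, batch, and optimizer-step counts for one run.

    Assumes non-overlapping blocks and that a trailing partial block
    is dropped.
    """
    num_blocks = num_tokens // block_size
    batch_size = per_gpu_batch_size * max(1, n_gpu)
    batches_per_epoch = -(-num_blocks // batch_size)  # ceiling division
    # One weight update happens every `gradient_accumulation_steps` batches.
    steps_per_epoch = batches_per_epoch // gradient_accumulation_steps
    return {
        "num_blocks": num_blocks,
        "effective_batch_size": batch_size * gradient_accumulation_steps,
        "optimizer_steps": int(steps_per_epoch * num_train_epochs),
    }

# Hypothetical corpus of 1,000,000 tokens with the settings used in this notebook.
print(training_schedule(1_000_000, 512, 2, 5, 3.0))
```

With a per-GPU batch size of 2 and 5 accumulation steps, each weight update averages gradients over 10 blocks, which is why accumulation helps when memory limits the batch size.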



In [4]:
!python run_language_modeling.py \
    --output_dir='/content/drive/My Drive/finetuned_models/star_wars_final' \
    --model_type=gpt2 \
    --model_name_or_path=gpt2-medium \
    --save_total_limit=5 \
    --num_train_epochs=3.0 \
    --do_train \
    --evaluate_during_training \
    --logging_steps=500 \
    --save_steps=500 \
    --train_data_file=/content/star_wars_train.txt \
    --do_eval \
    --eval_data_file=/content/star_wars_valid.txt \
    --per_gpu_train_batch_size=2 \
    --per_gpu_eval_batch_size=2 \
    --block_size=512 \
    --gradient_accumulation_steps=5 \
    --learning_rate=1e-4 

2021-02-27 21:09:12.524875: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
02/27/2021 21:09:14 - INFO - transformers.configuration_utils -   loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-medium-config.json from cache at /root/.cache/torch/transformers/98aa65385e18b0efd17acd8bf64dcdf21406bb0c99c801c2d3c9f6bfd1f48f29.250d6dc755ccb17d19c7c1a7677636683aa35f0f6cb5461b3c0587bc091551a0
02/27/2021 21:09:14 - INFO - transformers.configuration_utils -   Model config GPT2Config {
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 1024,
  "n_head": 16,
  "n_layer": 24,
  "n_positions": 1024,
  "n_special": 0,
  "predict_special_tokens": true,
  "resid_pdro

### Compute perplexity of a dataset.

The following functions load a pretrained model, either from the library or from a local directory, and compute its perplexity on a dataset.

In [5]:
def load_model(args):
  """Creates a model and loads in weights for it."""
  # args.model_name_or_path may be a library model name (e.g. "gpt2-medium")
  # or a local checkpoint directory.
  model = AutoModelWithLMHead.from_pretrained(args.model_name_or_path)
  model.to(args.device)
  return model

def set_seed(args):
  """Set the random seed."""
  random.seed(args.seed)
  np.random.seed(args.seed)
  torch.manual_seed(args.seed)
  if args.n_gpu > 0:
    torch.cuda.manual_seed_all(args.seed)

def do_perplexity_eval(args, model, data_file_path):
  """Computes the perplexity of the text in data_file_path according to the provided model."""
  set_seed(args)

  args.eval_data_file=data_file_path

  tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, cache_dir=None)

  args.block_size = min(args.block_size, tokenizer.max_len)

  result = run_language_modeling.evaluate(args, model, tokenizer, prefix="")
  return result
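
For reference, perplexity is the exponential of the average cross-entropy loss per token. The sketch below shows that relationship in plain Python. It assumes the evaluation averages the per-batch losses before exponentiating; the function name and the loss values are hypothetical.

```python
import math

def perplexity_from_batch_losses(batch_losses):
    """Perplexity = exp(mean cross-entropy loss), with losses in nats."""
    if not batch_losses:
        raise ValueError("need at least one batch loss")
    mean_loss = sum(batch_losses) / len(batch_losses)
    return math.exp(mean_loss)

# Hypothetical per-batch losses, for illustration only.
losses = [2.1, 1.9, 2.0]
print(perplexity_from_batch_losses(losses))
```

Read the other way round, a perplexity of about 7.1 corresponds to a mean loss of roughly 1.96 nats per token. Lower is better: a perplexity of 1 would mean the model predicts every token with certainty.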

In [7]:
# Set this to the checkpoint you want to evaluate, or to "gpt2-medium" to
# evaluate the pre-trained model without finetuning.
CHECKPOINT_PATH = '/content/drive/MyDrive/finetuned_models/star_wars_final'
#CHECKPOINT_PATH = "gpt2-medium"

# Set this to the list of text files you want to evaluate the perplexity of.
DATA_PATHS = ["/content/star_wars_valid.txt",
              "/content/star_wars_test.txt"]

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
n_gpu = torch.cuda.device_count()
print("Running on device: ", device)

args = collections.defaultdict(
  model_name_or_path=CHECKPOINT_PATH,
  output_dir=CHECKPOINT_PATH,
  block_size = 512,
  local_rank=-1,
  eval_batch_size=2,
  per_gpu_eval_batch_size=2,
  n_gpu=n_gpu,
  mlm=False,
  device=device,
  line_by_line=False,
  overwrite_cache=None,
  model_type='gpt2',
  seed=42,
)
#args = DictToObj(args)
args = to_object(args)

model = load_model(args)

for data_path in DATA_PATHS:
  eval_results = do_perplexity_eval(args, model, data_path)
  perplexity = eval_results['perplexity']
  print('{} is the perplexity of {} according to {}'.format(
      perplexity, data_path, CHECKPOINT_PATH))

02/28/2021 11:34:45 - INFO - transformers.configuration_utils -   loading configuration file /content/drive/MyDrive/finetuned_models/star_wars_final/config.json
02/28/2021 11:34:45 - INFO - transformers.configuration_utils -   Model config GPT2Config {
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 1024,
  "n_head": 16,
  "n_layer": 24,
  "n_positions": 1024,
  "n_special": 0,
  "predict_special_tokens": true,
  "resid_pdrop": 0.1,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "vocab_size": 50257
}

02/28/2021 11:34:45 - INFO - 

Running on device:  cuda



7.103740692138672 is the perplexity of /content/star_wars_valid.txt according to /content/drive/MyDrive/finetuned_models/star_wars_final


02/28/2021 11:35:24 - INFO - run_language_modeling -   Saving features into cached file /content/gpt2_cached_lm_512_star_wars_test.txt
02/28/2021 11:35:24 - INFO - run_language_modeling -   ***** Running evaluation  *****
02/28/2021 11:35:24 - INFO - run_language_modeling -     Num examples = 103
02/28/2021 11:35:24 - INFO - run_language_modeling -     Batch size = 2
Evaluating: 100%|██████████| 52/52 [00:25<00:00,  2.03it/s]
02/28/2021 11:35:49 - INFO - run_language_modeling -   ***** Eval results  *****
02/28/2021 11:35:49 - INFO - run_language_modeling -     perplexity = tensor(7.6790)


7.679045677185059 is the perplexity of /content/star_wars_test.txt according to /content/drive/MyDrive/finetuned_models/star_wars_final


### Generate samples
The following code generates text samples that are continuations of a provided prompt.

In [6]:
def generate_samples(args, model, prompt_text):
  """Generates samples for the provided prompt using the provided model."""

  print(args)
  set_seed(args)

  tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, cache_dir=None)

  requires_preprocessing = args.model_type in run_generation.PREPROCESSING_FUNCTIONS.keys()
  encoded_prompt = tokenizer.encode(prompt_text, add_special_tokens=False, return_tensors="pt")
  encoded_prompt = encoded_prompt.to(args.device)

  output_sequences = model.generate(
      input_ids=encoded_prompt,
      max_length=args.length + len(encoded_prompt[0]),
      temperature=args.temperature,
      top_k=args.k,
      top_p=args.p,
      repetition_penalty=args.repetition_penalty,
      do_sample=True,
      num_return_sequences=args.num_return_sequences,
  )

  # Remove the batch dimension when returning multiple sequences
  if len(output_sequences.shape) > 2:
    output_sequences.squeeze_()

  generated_sequences = []

  for generated_sequence_idx, generated_sequence in enumerate(output_sequences):
    generated_sequence = generated_sequence.tolist()

    # Decode text
    text = tokenizer.decode(generated_sequence, clean_up_tokenization_spaces=True)

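    # Remove all text after the stop token (only if one is set and present;
    # otherwise find() returns -1 and the last character would be cut off)
    if args.stop_token and args.stop_token in text:
      text = text[: text.find(args.stop_token)]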

    # Remove the excess text that was used for pre-processing
    text = text[len(tokenizer.decode(encoded_prompt[0], clean_up_tokenization_spaces=True)) :]

    # Add the prompt at the beginning of the sequence.
    total_sequence = prompt_text + text

    generated_sequences.append(total_sequence)

  return generated_sequences
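
The `temperature`, `top_k` and `top_p` arguments passed to `model.generate` reshape the next-token distribution before a token is sampled. The pure-Python sketch below illustrates the idea on a toy vocabulary. It is not the library's implementation: the function names are hypothetical, and it assumes top-k is applied first and nucleus (top-p) filtering is then applied to the renormalised remainder.

```python
import math
import random

def _normalise(weights):
    """Rescale a {token_id: weight} mapping so the weights sum to 1."""
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}

def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return the sampling distribution after temperature, top-k and top-p."""
    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    probs = _normalise({i: math.exp(s - peak) for i, s in enumerate(scaled)})
    ranked = sorted(probs, key=probs.get, reverse=True)

    # Top-k: keep only the k most probable tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
        probs = _normalise({i: probs[i] for i in ranked})

    # Top-p: keep the smallest prefix whose cumulative probability exceeds p.
    if top_p < 1.0:
        kept, cumulative = {}, 0.0
        for i in ranked:
            kept[i] = probs[i]
            cumulative += probs[i]
            if cumulative > top_p:
                break
        probs = _normalise(kept)
    return probs

# Toy 4-token vocabulary with probabilities 0.5, 0.3, 0.15 and 0.05.
toy_logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.05)]
dist = filter_probs(toy_logits, temperature=1.0, top_k=3, top_p=0.8)
next_token = random.choices(list(dist), weights=list(dist.values()))[0]
print(dist, next_token)
```

In the toy example only the two most probable tokens survive both filters, so unlikely continuations can never be sampled. The notebook's settings (`k=50`, `p=0.9`) apply the same kind of pruning over the full GPT-2 vocabulary.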

In [7]:
# Generation of original text from a prompt 

def generate_output(model_path, prompt):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    n_gpu = torch.cuda.device_count()
    #print("Running on device: ", device)

    CHECKPOINT_PATH = model_path
    PROMPT = prompt

    args = collections.defaultdict(
      model_name_or_path=CHECKPOINT_PATH,
      output_dir=CHECKPOINT_PATH,
      n_gpu=n_gpu,
      mlm=False,
      device=device,
      model_type='gpt2',
      seed=42,
      stop_token=None, # Set this if your dataset has a special word that indicates the end of a text.
      temperature=1.0,  # temperature sampling. Set this to temperature=1.0 to not use temperature.
      k=50,  # k for top-k sampling. Set this to k=0 to not use top-k.
      p=0.9,  # p for nucleus sampling. Set this to p=1.0 to not use nucleus sampling.
      repetition_penalty= 1.1,
      length=100,  # Number of tokens to generate.
      num_return_sequences=10,  # Number of independently computed samples to generate.
    )
    #args = DictToObj(args)
    args = to_object(args)

    model = load_model(args)
    print(args)
    sequences = generate_samples(args, model, PROMPT)
    for idx, sequence in enumerate(sequences):
      print('\n====== GENERATION {} ======'.format(idx))
      print(sequence)
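
The `repetition_penalty=1.1` setting discourages the model from repeating tokens it has already produced. The sketch below illustrates the CTRL-style penalty as I understand it: logits of previously generated tokens are divided by the penalty when positive and multiplied by it when negative, so they always become less likely. The function name is hypothetical and the code illustrates the idea rather than reproducing the library's implementation.

```python
def apply_repetition_penalty(logits, previous_token_ids, penalty=1.1):
    """Return a copy of `logits` with already-generated tokens made less likely."""
    adjusted = list(logits)
    for token_id in set(previous_token_ids):
        if adjusted[token_id] < 0:
            # Negative logits are pushed further down.
            adjusted[token_id] *= penalty
        else:
            # Positive logits are shrunk towards zero.
            adjusted[token_id] /= penalty
    return adjusted

# Toy logits for a 3-token vocabulary; tokens 0 and 1 were already generated.
print(apply_repetition_penalty([2.2, -1.0, 0.5], [0, 1, 1], penalty=1.1))
```

A penalty of 1.0 leaves the logits unchanged, and larger values suppress repetition more aggressively.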


In [8]:
# Set this to the checkpoint you want to use for generation, or to "gpt2-medium"
# to generate with the pre-trained model without finetuning.
CHECKPOINT_PATH = '/content/drive/MyDrive/finetuned_models/star_wars_final/'

PROMPT = 'Leia and Luke look to the sky, as a spaceship approaches.'

#generate_output( "gpt2-medium", PROMPT)
generate_output(CHECKPOINT_PATH, PROMPT)

PROMPT2 = "A long time ago, in a galaxy far, far away..."

generate_output(CHECKPOINT_PATH, PROMPT2)



<class '__main__.jo'>
<class '__main__.jo'>




Leia and Luke look to the sky, as a spaceship approaches.
     The ship lifts off like a rocket toward their location in space. Leia looks over at Han; he's also looking away from the craft.
HAN : (Cont'd) Where are we going?
LEIA : We're just passing through there -- I guess this is it! Look... where am??
THREEPIO laughs nervously. He puts his hand on an air filter that washes out the spray of blood left by Lando's decapitated head

Leia and Luke look to the sky, as a spaceship approaches.
LEIA: What's it over there?
PADMÉ
There's no way that could have happened... we're in trouble!  I'm sending you two back before this whole thing blows up!
EXTERIOR -- HOTH -- DARTH VADER'S CASTLE -- MAIN BRIDGE
Vader leads Poe down a long hallway where several guards guard droids are waiting for him. He stops suddenly on an upper level with his eyes dart

Leia and Luke look to the sky, as a spaceship approaches.  The ship's
light is very soft on their faces as they hold each other tight in an embra


<class '__main__.jo'>
<class '__main__.jo'>

A long time ago, in a galaxy far, far away...
PALPATINE (in hologram) If I may ask of you something personal. And it's important.... It might change your destiny. But be careful what is written in the dark side of the Force. You cannot keep yourself from my gaze forever. When times are difficult, when danger is threatening to everyone you love... take heart. There will always come a day when you must learn to trust me. Trust only that which pleases me. Protecting someone who does not deserve

A long time ago, in a galaxy far, far away...
LEIA:...My father once gave us this message to you. I hope we can return it safely home."
EXT-STARKILLER BASE - DAY
GENERAL GRIEVOUS stands and surveys the battle scene around him with some concern. THREEPIO is sitting by Leia's side -- not looking toward her at all.  He gets up slowly as he sees General Hux standing on the observation platform overlooking the trench of the Death Star.  His eyes light up

A 

In [9]:
PROMPT3 = 'Leia and Luke look to the sky, as a spaceship approaches. They cannot yet determine if it is a ship from the Resistance, or the First Order. \n BEN: Luke, Leia, I come in peace. Please allow me access to the base. \n LEIA:'

generate_output(CHECKPOINT_PATH, PROMPT3)


<class '__main__.jo'>
<class '__main__.jo'>





Leia and Luke look to the sky, as a spaceship approaches. They cannot yet determine if it is a ship from the Resistance, or the First Order. 
 BEN: Luke, Leia, I come in peace. Please allow me access to the base. 
 LEIA: What?!? Ben! The Queen asked you to go see her personally. To discuss this with our new Supreme Leader. So here we are now...and only then can she know how much your friendship extends. Let me explain...
PALPATINE
(continuing) No one will be able pinpoint where Skywalker left that trail until he returns alive. Only those who have followed him may do so, but they must stay hidden for their own protection. And please find an old friend of

Leia and Luke look to the sky, as a spaceship approaches. They cannot yet determine if it is a ship from the Resistance, or the First Order. 
 BEN: Luke, Leia, I come in peace. Please allow me access to the base. 
 LEIA: This isn't good. It seems our friend Han has been captured by the evil Empire.
THREEPIO
(continuing) There's nothin

In [1]:
PROMPT5 = "A long time ago, in a galaxy far, far away... The Galaxy is finally at peace, after the defeat of the First Order. Rey Skywalker is training a new generation of Jedi Knights, at the Academy on Tatooine. But the shadows of old enemies"
generate_output(CHECKPOINT_PATH, PROMPT5)

NameError: ignored

In [22]:
PROMPT4 = "Leia and Luke look to the sky, as a spaceship approaches. \n LUKE \n (over speaker) \n We've spotted that spaceship. \n Luke looks at the small yellow dot which the ship has crossed over, and thinks it through for a moment. He then puts his head in his hands."
generate_output(CHECKPOINT_PATH, PROMPT4)


<class '__main__.jo'>
<class '__main__.jo'>





Leia and Luke look to the sky, as a spaceship approaches. 
 LUKE 
 (over speaker) 
 We've spotted that spaceship. 
 Luke looks at the small yellow dot which the ship has crossed over, and thinks it through for a moment. He then puts his head in his hands.
INT. MILLENNIUM FALCON - COCKPIT/SPACE
The Falcon sits on a large hangar floor surrounded by rows of tiny little craft. The two Millennium Falcon pilots chat away while several PILOTS pass out food from some hanging lids -- ANAKIN graciously does so. LANDO lets go with an awkward shake of Artoo's neck.
LANDo turns back around and gives Han Solo another hug.
LARK: So what brings you

Leia and Luke look to the sky, as a spaceship approaches. 
 LUKE 
 (over speaker) 
 We've spotted that spaceship. 
 Luke looks at the small yellow dot which the ship has crossed over, and thinks it through for a moment. He then puts his head in his hands.
LUCE: No no no! What are you doing here?
INT. CORUSCANT - MAIN HANGAR DECK - NIGHT cloaks pass with e