<a href="https://colab.research.google.com/github/joexu22/clifford/blob/master/gpt_neo/GPTNeo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Me Setting Up GPTNeo

[GPTNeo](https://github.com/EleutherAI/GPTNeo) by [EleutherAI](https://www.eleuther.ai)

[Eleuther's Discord](https://discord.gg/BK2v3EJ) 

Setup Instructions

1. Turn on the TPU runtime.
  - Open the "Runtime" menu, choose "Change runtime type", and select "TPU" under "Hardware accelerator".
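
To confirm that the runtime switch took effect, a quick check like the one below can help. It is a minimal sketch that assumes the classic Colab TPU runtime, which exposes the TPU address through the `COLAB_TPU_ADDR` environment variable; newer runtime types may not set it. The helper name `colab_tpu_address` is hypothetical.

```python
import os


def colab_tpu_address(env=None):
    """Return the TPU gRPC address if the runtime exposes one, else None.

    Assumption: the classic Colab TPU runtime sets COLAB_TPU_ADDR to "host:port".
    """
    env = os.environ if env is None else env
    addr = env.get("COLAB_TPU_ADDR")
    return f"grpc://{addr}" if addr else None


# Prints the address on a TPU runtime, or a hint when no TPU is attached.
print(colab_tpu_address() or "No TPU found: check Runtime -> Change runtime type")
```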

In [None]:
#@title Setup

%tensorflow_version 2.x
!git clone https://github.com/EleutherAI/GPTNeo
%cd GPTNeo
!pip3 install -q -r requirements.txt

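Notes

TPUs cannot read from or write to the Colab VM's local disk, so the dataset, the model weights, and the checkpoints all have to live in a Google Cloud Storage bucket.

Google Cloud console: https://console.cloud.google.com/

Create a Cloud Storage bucket: https://console.cloud.google.com/storage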

In [None]:
#@title Connect To Google Cloud Bucket - Step 1 (Authentication)

from google.colab import auth
auth.authenticate_user()
!gcloud init

In [None]:
#@title Use Cloud Bucket - Step 2 (Locate/Use Bucket Location)

path_to_cloud_bucket = 'gs://my-machine-learning-bucket' #@param {type:"string"}

In [None]:
#@title Use Dataset (Currently: Sampling Only)

# No training dataset is needed when only sampling from a pretrained model.
dataset = "Sampling_Only"

In [None]:
#@title Sampling Only Configuration - Step 1 (CD to GPTNeo Directory)

# %%writefile is a cell magic and must be the first line of its cell, so the %cd gets a cell of its own
%cd /content/GPTNeo

Sampling Only Configuration - Step 2 (Write File)

In [None]:
%%writefile configs/dataset_configs/Sampling_Only.json

{
  "path": "gs://my-machine-learning-bucket/datasets/Sampling_Only/Sampling_Only*.tfrecords",
  "eval_path": "",
  "n_vocab": 50256,
  "tokenizer_is_pretrained": true,
  "tokenizer_path": "gpt2",
  "eos_id": 50256,
  "padding_id": 50257
}
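
The JSON above hard-codes the bucket name, so it has to be edited by hand whenever `path_to_cloud_bucket` changes. As a sketch of one way around that, the same fields can be built in Python from the bucket path. The helper name `sampling_only_config` is hypothetical; the field values are copied from the config above.

```python
import json


def sampling_only_config(bucket):
    """Build the Sampling_Only dataset config for a given gs:// bucket path."""
    bucket = bucket.rstrip("/")
    return {
        "path": f"{bucket}/datasets/Sampling_Only/Sampling_Only*.tfrecords",
        "eval_path": "",
        "n_vocab": 50256,
        "tokenizer_is_pretrained": True,
        "tokenizer_path": "gpt2",
        "eos_id": 50256,
        "padding_id": 50257,
    }


config_text = json.dumps(sampling_only_config("gs://my-machine-learning-bucket"), indent=2)
print(config_text)
```

In the notebook, `config_text` could then be written to `configs/dataset_configs/Sampling_Only.json` with a plain `open(..., "w")` instead of `%%writefile`.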


Sampling Only Configuration - Step 3 (Write Model Config)

In [None]:
%%writefile configs/GPT3_2-7B.json

{
    "n_head": 16,
    "n_vocab": 50257,
    "embed_dropout": 0,
    "lr": 0.0002,
    "lr_decay": "cosine",
    "warmup_steps": 3000,
    "beta1": 0.9,
    "beta2": 0.95,
    "epsilon": 1e-8,
    "opt_name": "adam",
    "weight_decay": 0,
    "train_batch_size": 256,
    "attn_dropout": 0,
    "train_steps": 600000,
    "eval_steps": 0,
    "predict_steps": 1,
    "res_dropout": 0,
    "eval_batch_size": 4,
    "predict_batch_size": 1,
    "iterations": 100,
    "n_embd": 2048,
    "datasets": [["pile", null, null, null]],
    "model": "GPT",
    "model_path": "gs://my-machine-learning-bucket/GPT3_2-7B",
    "n_ctx": 2048,
    "n_layer": 24,
    "scale_by_depth": true,
    "scale_by_in": false,
    "attention_types" :  [[["global", "local"],12]],
    "mesh_shape": "x:4,y:2",
    "layout": "intermediate_expanded:x,heads:x,vocab:n_vocab,memory_length:y,embd:y",
    "activation_function": "gelu",
    "recompute_grad": true,
    "gradient_clipping": 1.0,
    "tokens_per_mb_per_replica": 2048,
    "precision": "bfloat16"
}
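
A few of these fields have to agree with each other, and a mismatch usually only shows up once the run starts. The sketch below checks three of them. It rests on assumptions that are not stated in this notebook: that each `attention_types` entry `[pattern, repeats]` expands to `len(pattern) * repeats` layers, that the mesh dimensions should multiply to the number of TPU cores, and that a Colab TPU has 8 cores. The function names are hypothetical.

```python
def mesh_size(mesh_shape):
    """Multiply the dimension sizes in a mesh string such as "x:4,y:2"."""
    size = 1
    for dim in mesh_shape.split(","):
        _name, count = dim.split(":")
        size *= int(count)
    return size


def expanded_layer_count(attention_types):
    """Count layers described by entries of the form [pattern, repeats]."""
    return sum(len(pattern) * repeats for pattern, repeats in attention_types)


def check_model_config(cfg, tpu_cores=8):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if cfg["n_embd"] % cfg["n_head"] != 0:
        problems.append("n_embd is not divisible by n_head")
    if expanded_layer_count(cfg["attention_types"]) != cfg["n_layer"]:
        problems.append("attention_types does not expand to n_layer layers")
    if mesh_size(cfg["mesh_shape"]) != tpu_cores:
        problems.append("mesh_shape does not multiply to the TPU core count")
    return problems


cfg = {
    "n_head": 16,
    "n_embd": 2048,
    "n_layer": 24,
    "attention_types": [[["global", "local"], 12]],
    "mesh_shape": "x:4,y:2",
}
print(check_model_config(cfg))
```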

In [None]:
# @title Get Pretrained Models - Step 1 (Uncomment If Missing Weights)

pretrained_model = 'GPT3_2-7B' #@param ["GPT3_XL", "GPT3_2-7B"]

# !wget -m -np -c -U "eye02" -w 2 -R "index.html*" "https://the-eye.eu/public/AI/gptneo-release/$pretrained_model/"

path_to_local_weights = f"/content/GPTNeo/the-eye.eu/public/AI/gptneo-release/{pretrained_model}"

# URL = f"http://eaidata.bmk.sh/data/gptneo-release/{pretrained_model}/"
# FOLDER_NAME = "GPT3_XL"
# !curl $URL | grep -i "</a>" | sed -n 's/.*href="\([^"]*\).*/\1/p' | sed "s|^|$URL|" | xargs -n 1 -P 4 wget -P $pretrained_model
# path_to_local_weights = pretrained_model
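
Because `wget -m` mirrors the site's directory layout, the weights end up several folders deep. Before uploading, it can be worth confirming that files actually landed under `path_to_local_weights`. The helper below is a hypothetical sketch that only uses the standard library; it returns an empty list when the directory does not exist.

```python
import os


def list_weight_files(root):
    """Return sorted (relative_path, size_in_bytes) pairs for every file under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append((os.path.relpath(full, root), os.path.getsize(full)))
    return sorted(found)


# Example with the default mirror location used in this notebook.
local_dir = "/content/GPTNeo/the-eye.eu/public/AI/gptneo-release/GPT3_2-7B"
for rel_path, size in list_weight_files(local_dir):
    print(f"{size:>14,d}  {rel_path}")
```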

In [None]:
# @title Get Pretrained Models - Step 2 (Uncomment If Missing Weights)

bucket_base = "gs://" + path_to_cloud_bucket.replace('gs://', '').split('/')[0]

# upload to your bucket
# !gsutil -m cp -r $path_to_local_weights $bucket_base
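
The `bucket_base` expression keeps only the bucket name, so the weights are uploaded to the bucket root even if `path_to_cloud_bucket` points at a subfolder. The same logic as a small function with input validation might look like this; the name `bucket_root` is hypothetical.

```python
def bucket_root(gs_path):
    """Reduce a gs:// path to its bucket root, e.g. gs://bucket/a/b -> gs://bucket."""
    prefix = "gs://"
    if not gs_path.startswith(prefix):
        raise ValueError(f"expected a path starting with {prefix!r}, got {gs_path!r}")
    name = gs_path[len(prefix):].split("/")[0]
    if not name:
        raise ValueError("bucket name is empty")
    return prefix + name


print(bucket_root("gs://my-machine-learning-bucket/datasets/Sampling_Only"))
```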

In [None]:
# @title Confirm Bucket
!gsutil ls $bucket_base

In [None]:
# @title Get The GPT3_2-7B Config

%cd /content/GPTNeo
!wget "https://the-eye.eu/public/AI/gptneo-release/GPT3_2-7B/config.json" -P configs/local_configs/

In [None]:
# @title Modify Config for Colab

import json
from pprint import pprint

path_to_model = "" #@param {type:"string"}
batch_size = 8 #@param {type:"integer"}
dset = ""  #@param {type:"string"}
mesh_shape = "x:4,y:2" #@param {type:"string"}
train_steps = 1000 #@param {type:"integer"}
steps_per_checkpoint = 500 #@param {type:"integer"}
start_step = 400000 if pretrained_model == "GPT3_2-7B" else 362000

if path_to_model == "":
  path_to_model = f'{bucket_base.strip("/")}/{pretrained_model}'
print(f'MODEL PATH: {path_to_model}\n')

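# Only fill in dset when the form field was left empty.
# Check for a missing dataset first; otherwise dset would be set to None.
if dset == "":
  if dataset is None:
    dset = "pile"
  elif dataset != "Sampling_Only":
    dset = dataset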

def pad_to_multiple_of(n, mult):
  """Round n up to the nearest multiple of mult."""
  extra = n % mult
  if extra > 0:
    n = n + mult - extra
  return n

with open('/content/GPTNeo/configs/local_configs/config.json', 'r') as f:
  data = json.load(f)
  pprint(data)
  dset_val = [[dset, None, None, None]] if dset != "" else data["datasets"]
  mods = {
          "mesh_shape": mesh_shape,
          "layout": "intermediate_expanded:x,heads:x,memory_length:y,embd:y",
          "model_path": path_to_model,
          "datasets": dset_val,
          "train_steps": start_step + train_steps,
          "eval_steps": 0,
          "train_batch_size": batch_size,
          "predict_batch_size": batch_size
        }
  data.update(mods)
  print('\n--->\n')
  pprint(data)
  with open(f'configs/{pretrained_model}.json', 'w') as outfile:
    json.dump(data, outfile, indent=2)
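
The core of the cell above is a dictionary merge: the downloaded config keeps all of its fields except the handful that are overridden for Colab. `train_steps` is set to `start_step + train_steps`, presumably because the released checkpoint is already at `start_step`, so a smaller target would leave nothing to train. A stripped-down sketch of that merge as a pure function (the name `colab_overrides` and its parameters are hypothetical):

```python
def colab_overrides(config, model_path, batch_size, start_step, extra_steps,
                    mesh_shape="x:4,y:2"):
    """Return a copy of config with the Colab-specific fields replaced."""
    updated = dict(config)  # copy so the downloaded config is left untouched
    updated.update({
        "mesh_shape": mesh_shape,
        "model_path": model_path,
        "train_steps": start_step + extra_steps,
        "eval_steps": 0,
        "train_batch_size": batch_size,
        "predict_batch_size": batch_size,
    })
    return updated


base = {"train_steps": 400000, "train_batch_size": 512, "n_layer": 32}
result = colab_overrides(base, "gs://my-machine-learning-bucket/GPT3_2-7B",
                         batch_size=8, start_step=400000, extra_steps=1000)
print(result["train_steps"])
```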

Sample Text

In [None]:
%%writefile example_prompt.txt

So, GPTNeo. I have set you up.
What are your further instructions?

List Them Here:


In [None]:
# @title Quick Train
# !python3 main.py --model $pretrained_model --steps_per_checkpoint $steps_per_checkpoint --tpu colab

In [None]:
# @title Run Code
!python3 main.py --model $pretrained_model --steps_per_checkpoint 500 --tpu colab --predict --prompt example_prompt.txt
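
If the sampling command needs to be launched from Python rather than with `!`, for example inside a loop over several prompt files, the same flags can be assembled as an argument list. This is only a sketch; the helper `predict_command` is hypothetical and uses just the flags that appear in the cell above.

```python
import shlex


def predict_command(model, prompt_file, steps_per_checkpoint=500):
    """Build the argv list for a prediction run with the flags used in this notebook."""
    return [
        "python3", "main.py",
        "--model", model,
        "--steps_per_checkpoint", str(steps_per_checkpoint),
        "--tpu", "colab",
        "--predict",
        "--prompt", prompt_file,
    ]


cmd = predict_command("GPT3_2-7B", "example_prompt.txt")
print(shlex.join(cmd))
```

The list can be passed directly to `subprocess.run(cmd, check=True)` from the `/content/GPTNeo` directory.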