# Install KerasNLP, Choose Backend and Import Dependencies

This example uses [Keras Core](https://keras.io/keras_core/) to work in any of "tensorflow", "jax" or "torch". Support for Keras Core is baked into KerasNLP; simply change the "KERAS_BACKEND" environment variable to select the backend of your choice. We select the TensorFlow backend below.

Source tutorial: https://keras.io/examples/generative/gpt2_text_generation_with_kerasnlp/

In [None]:
%pip install pip -U -q
%pip install tensorflow~=2.13.1 keras-nlp==0.6.2 -q

In [None]:
import os

# Set the backend before importing keras_nlp or keras_core.
os.environ["KERAS_BACKEND"] = "tensorflow"  # or "jax" or "torch"

import keras_nlp
import tensorflow as tf
import keras_core as keras
import time

In [None]:
# cuda_malloc_async has fewer fragmentation issues than the default BFC memory allocator - https://docs.nvidia.com/deeplearning/frameworks/tensorflow-user-guide/index.html#tf_gpu_allocator

os.environ["TF_GPU_ALLOCATOR"] = "cuda_malloc_async"
print(os.getenv("TF_GPU_ALLOCATOR"))

# Load the model previously trained

In [None]:
gpt2_lm = keras.models.load_model("../models/gpt2_lm.keras")

# Into the Sampling Method
KerasNLP offers a few sampling methods, e.g., contrastive search, Top-K and beam sampling. By default, GPT2CausalLM uses Top-K search, but you can choose your own sampling method.

Much like optimizers and activations, there are two ways to specify your custom sampler:

- Use a string identifier, such as "greedy". This uses the sampler's default configuration.
- Pass a [keras_nlp.samplers.Sampler](https://keras.io/api/keras_nlp/samplers/samplers#sampler-class) instance. This lets you use a custom configuration.

For more details on KerasNLP Sampler class, you can check the code [here](https://github.com/keras-team/keras-nlp/tree/master/keras_nlp/samplers).
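
To make the default concrete, here is a minimal pure-Python sketch of what Top-K sampling does at a single generation step: keep the k highest-scoring tokens, renormalize them with a softmax, and draw one. The function name `top_k_sample`, the toy logits and the seed are illustrative assumptions; this is not KerasNLP's implementation.

```python
import math
import random


def top_k_sample(logits, k, rng):
    """Sample one token index from the k highest-scoring logits."""
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to the kept logits (shifted for numerical stability).
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


toy_logits = [2.0, 0.5, 1.5, -1.0, 0.1]  # hypothetical scores for 5 tokens
rng = random.Random(0)
samples = [top_k_sample(toy_logits, k=2, rng=rng) for _ in range(100)]

# Only the two best-scoring indices (0 and 2) can ever be drawn.
print(sorted(set(samples)))
```

With `k=1` the same function always returns the highest-scoring token, which is the behaviour of greedy search.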

# Finetune on Chinese Poem Dataset

We can also fine-tune GPT-2 on non-English datasets. For readers who know Chinese, this part illustrates how to fine-tune GPT-2 on a Chinese poem dataset to teach our model to become a poet!

Because GPT-2 uses a byte-pair encoder, and the original pretraining dataset contains some Chinese characters, we can use the original vocab to fine-tune on a Chinese dataset.
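
One way to see why the original vocabulary still applies is that GPT-2's byte-pair encoding operates on UTF-8 bytes, so any string can be represented even when no merged token covers it. The sketch below only illustrates that byte-level view in plain Python; it does not run the tokenizer itself.

```python
text = "红帽是"

# Each of these characters takes three bytes in UTF-8.
raw = text.encode("utf-8")
print(len(text), len(raw))  # prints "3 9"

# Every byte value is in range(256), so a byte-level vocabulary always covers it.
assert all(0 <= b < 256 for b in raw)

# Decoding the bytes recovers the original text without loss.
assert raw.decode("utf-8") == text
```

A practical consequence is that Chinese text tends to need more tokens per character than English, so generated sequences reach `max_length` sooner.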

In [None]:
# Load chinese poetry dataset.
!git clone https://github.com/chinese-poetry/chinese-poetry.git

Load text from the json file. We only use《全唐诗》for demo purposes.

In [None]:
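import os
import json

poem_dir = "chinese-poetry/全唐诗"
poem_collection = []
for file in os.listdir(poem_dir):
    # Skip anything that is not a poem JSON file (those have "poet" in the name).
    if ".json" not in file or "poet" not in file:
        continue
    full_filename = os.path.join(poem_dir, file)
    # Read as UTF-8 explicitly rather than relying on the platform default.
    with open(full_filename, "r", encoding="utf-8") as f:
        content = json.load(f)
        poem_collection.extend(content)

# Join each poem's lines into a single string.
paragraphs = ["".join(data["paragraphs"]) for data in poem_collection]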
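
For orientation, the loading code above expects each file to hold a list of JSON objects whose `paragraphs` field is a list of lines, and it joins those lines into one string per poem. The record below is a hand-written illustration of that shape; the `author` and `title` fields and all values are assumptions, not copied from the repository.

```python
import json

# Hypothetical record mimicking the layout the loading code expects.
sample_json = """
[
  {"author": "李白", "title": "静夜思",
   "paragraphs": ["床前明月光，疑是地上霜。", "举头望明月，低头思故乡。"]}
]
"""

records = json.loads(sample_json)
# Same join as in the loading code: one string per poem.
joined = ["".join(record["paragraphs"]) for record in records]
print(joined[0])
```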

Let's take a look at sample data.

In [None]:
print(paragraphs[0])

As in the Reddit example, we convert the text to a TF dataset and train on only part of the data.

In [None]:
train_ds = (
    tf.data.Dataset.from_tensor_slices(paragraphs)
    .batch(16)
    .cache()
    .prefetch(tf.data.AUTOTUNE)
)

# Running through the whole dataset takes a long time, so take only `500` batches
# and run 1 epoch for demo purposes.
train_ds = train_ds.take(500)
num_epochs = 1

learning_rate = keras.optimizers.schedules.PolynomialDecay(
    5e-4,
    decay_steps=train_ds.cardinality() * num_epochs,
    end_learning_rate=0.0,
)
loss = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
gpt2_lm.compile(
    optimizer=keras.optimizers.Adam(learning_rate),
    loss=loss,
    weighted_metrics=["accuracy"],
)

gpt2_lm.fit(train_ds, epochs=num_epochs)
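
To see what the schedule does, the sketch below reimplements polynomial decay in plain Python, assuming the defaults `power=1.0` and `cycle=False` that apply to the call above. With those defaults the learning rate falls linearly from `5e-4` to `0.0` over `decay_steps`. The value `500` assumes the dataset yields the full 500 batches for one epoch.

```python
def polynomial_decay(step, initial_lr=5e-4, end_lr=0.0, decay_steps=500, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay schedule."""
    step = min(step, decay_steps)  # the rate stays at end_lr after decay_steps
    fraction_left = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction_left**power + end_lr


# Start, midpoint, end, and a step past the end of the schedule.
for step in (0, 250, 500, 600):
    print(step, polynomial_decay(step))
```

Halfway through training the rate is half the initial value, and any step beyond `decay_steps` stays at `end_lr`.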

Let's check the result! Copy the output into [Google Translate](https://translate.google.com/) to read it in English.

In [None]:
# "Red Hat is" translated to Chinese is "红帽是" using https://translate.google.com/
output = gpt2_lm.generate("红帽是", max_length=200)
print(output)

# Save the fine-tuned GPT-2 model to object storage

You can save the model in different formats depending on how you intend to serve it. In short, this save will enable us to do early online experimentation with the fine-tuned model.

In [None]:
# Local storage
 
gpt2_lm.save("../models/gpt2_lm.keras")
# gpt2_lm.save('../models/gpt2_lm.h5')

## Save to S3 Object Storage (Minio)

Let's use the NVIDIA Triton model folder structure to store the saved models.

Triton model folder structure:

```
models (provide this dir as source / MODEL_REPOSITORY )
└─ [ model name ]
    └─ 1 (version)
        └── model.savedmodel (we will use .keras)
            ├── saved_model.pb
```
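
As a sketch of how that layout maps onto object-storage keys, the helper below joins a model name, a version and a file name with forward slashes. The function `triton_object_key` and the example arguments are hypothetical, chosen only to mirror the tree above.

```python
import posixpath


def triton_object_key(model_name, version, filename):
    """Build an S3-style key of the form <model name>/<version>/<file>."""
    # Version directories are numerically named; rejecting values below 1
    # is a sanity check added for this sketch.
    if int(version) < 1:
        raise ValueError("version must be a positive integer")
    return posixpath.join(model_name, str(int(version)), filename)


print(triton_object_key("gpt2", 1, "model.savedmodel/saved_model.pb"))
# prints "gpt2/1/model.savedmodel/saved_model.pb"
```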

In [None]:
# install requirements

%pip install -U boto3 python-dotenv -q

In [None]:
# assuming Minio is deployed, populate the environment variables

!  echo "AWS_S3_BUCKET=${AWS_S3_BUCKET:-models}" > .env
!  echo "AWS_S3_ENDPOINT=${AWS_S3_ENDPOINT:-http://minio.minio.svc:9000}" >> .env
!  echo "AWS_ACCESS_KEY_ID=$(oc -n minio extract secret/minio-root-user --keys=MINIO_ROOT_USER --to=-)" >> .env
!  echo "AWS_SECRET_ACCESS_KEY=$(oc -n minio extract secret/minio-root-user --keys=MINIO_ROOT_PASSWORD --to=-)" >> .env

In [None]:
# import the packages

import os

import boto3
from dotenv import load_dotenv

load_dotenv()
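
Before creating the client it can help to confirm that the expected variables were actually loaded, without printing the secrets themselves. The helper below is a small sketch: `missing_settings` is a hypothetical name, and the list of required keys is taken from the `.env` file written above.

```python
import os

REQUIRED_KEYS = (
    "AWS_S3_BUCKET",
    "AWS_S3_ENDPOINT",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_settings(env=None):
    """Return the required keys that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Report only the names, never the values.
print("missing:", missing_settings())
```

An empty value counts as missing here, which catches the case where an `oc extract` command returned nothing.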

In [None]:
# upload the model from local storage to S3

local_path = "../models"
remote_path = "gpt2/3"

bucket = os.getenv("AWS_S3_BUCKET", "models")

s3 = boto3.client(
    "s3",
    endpoint_url=os.getenv("AWS_S3_ENDPOINT", "http://minio.minio.svc:9000"),
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID", "minioadmin"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY", "minioadmin"),
)


if bucket not in [bu["Name"] for bu in s3.list_buckets()["Buckets"]]:
    s3.create_bucket(Bucket=bucket)


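def upload_directory(path, bucketname, prefix):
    """Upload every file under `path` to `bucketname` below `prefix`."""
    for root, dirs, files in os.walk(path):
        for file in files:
            # Only the file name is used, so subdirectories are flattened.
            key = f"{prefix}/{file}"
            print(f"uploading: {file} to {bucketname}/{key}")
            s3.upload_file(os.path.join(root, file), bucketname, key)
            print("[ok]")


upload_directory(path=local_path, bucketname=bucket, prefix=remote_path)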
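
If the local directory contains subfolders, it may be preferable to derive each object key from the file's path relative to the upload root, so the folder layout is preserved in the bucket. The sketch below shows only that key computation on a temporary directory, with no S3 calls; `iter_upload_keys` and the file names are hypothetical.

```python
import os
import tempfile


def iter_upload_keys(local_root, remote_prefix):
    """Yield (local_file, object_key) pairs, preserving subdirectories."""
    for root, _dirs, files in os.walk(local_root):
        for name in sorted(files):
            local_file = os.path.join(root, name)
            relative = os.path.relpath(local_file, local_root)
            # Object keys always use forward slashes, whatever the local OS.
            key = f"{remote_prefix}/{relative.replace(os.sep, '/')}"
            yield local_file, key


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny placeholder tree: one top-level file and one nested file.
    os.makedirs(os.path.join(tmp, "variables"))
    for rel in ("gpt2_lm.keras", os.path.join("variables", "weights.bin")):
        with open(os.path.join(tmp, rel), "w") as handle:
            handle.write("placeholder")
    keys = sorted(key for _path, key in iter_upload_keys(tmp, "gpt2/3"))

print(keys)
```

Each yielded pair could then be passed to `s3.upload_file(local_file, bucket, key)`.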