Let's export the trained model to the Hugging Face Hub in safetensors format for compatibility with downstream inference engines. First, we'll define some variables.

In [11]:
model_name = "SuperCool-4x-Small"
checkpoint_path = "./checkpoints/checkpoint.pt"
exports_path = "./exports"

Then, we'll load the base model checkpoint into memory from disk.

In [None]:
import torch

from model import SuperCool

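# Load onto the CPU; weights_only=True restricts unpickling to tensors and basic Python types.
checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=True)

# Rebuild the architecture from the hyperparameters stored in the checkpoint.
model = SuperCool(**checkpoint["model_args"])

# Compile before loading: a compiled module prefixes its state-dict keys with
# "_orig_mod.", which presumably matches how this checkpoint was saved.
model = torch.compile(model)

model.load_state_dict(checkpoint["model"])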

print("Base checkpoint loaded successfully")
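
If the checkpoint keys carry the `_orig_mod.` prefix that `torch.compile` adds and you would rather load them into an uncompiled model, one option is to strip the prefix first. This is a minimal sketch: `strip_compile_prefix` is a hypothetical helper, not part of PyTorch or of this project, and the toy dictionary stands in for `checkpoint["model"]`.

```python
def strip_compile_prefix(state_dict, prefix="_orig_mod."):
    """Return a copy of state_dict with a leading `prefix` removed from each key."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Toy stand-in for checkpoint["model"]; real values would be tensors.
compiled_state = {
    "_orig_mod.encoder.weight": 1,
    "_orig_mod.encoder.bias": 2,
    "head.weight": 3,  # keys without the prefix pass through unchanged
}

plain_state = strip_compile_prefix(compiled_state)
print(sorted(plain_state))
```

With the prefix removed, the dictionary could be passed to `load_state_dict` on a model that has not been wrapped by `torch.compile`.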

Since we applied a weight norm reparameterization to the model weights during training, we'll remove it before exporting the weights.

In [13]:
model.remove_weight_norms()

Now, let's save the model in Hugging Face format so that it can be used with the rest of the Hugging Face ecosystem.

In [None]:
from os import path

hf_path = path.join(exports_path, model_name)

model.save_pretrained(hf_path)

print(f"Model saved to {hf_path}")

Lastly, we'll log in to the Hugging Face Hub and upload the model under our account.

In [None]:
from huggingface_hub import login

# Replace the placeholder with your own access token; avoid committing a real token.
login(token="your-api-token")

model.push_to_hub(model_name)
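
Rather than pasting a token into the notebook, one option is to read it from an environment variable. A minimal sketch: `read_hub_token` is a hypothetical helper introduced here, and it assumes the token has been stored in `HF_TOKEN` before the notebook is started.

```python
import os


def read_hub_token(env_var="HF_TOKEN"):
    """Return the access token stored in `env_var`, or raise if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your access token.")
    return token


# Usage in the notebook (commented out so the sketch runs without network access):
# from huggingface_hub import login
# login(token=read_hub_token())
```

Keeping the token out of the notebook means the file can be shared or committed without leaking credentials.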