superinsight-trainer-gpt

This is a Python application that continuously looks for fine-tunes to train by calling the SuperInsight FineTuning API. After training has completed, the model can be exported to the GCP bucket defined by your environment variables.
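The poll, train, export cycle described above can be sketched roughly as follows. This is a minimal illustration, not the code in `main.py`: `run_trainer`, `fetch_job`, `train`, `export`, `poll_seconds` and `max_polls` are all hypothetical names, and the real API calls are left to the caller.

```python
import time
from typing import Any, Callable, Optional


def run_trainer(
    fetch_job: Callable[[], Optional[Any]],
    train: Callable[[Any], str],
    export: Optional[Callable[[str], None]] = None,
    poll_seconds: float = 30.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll for jobs, train each one, optionally export the result.

    All callables are hypothetical stand-ins: fetch_job returns a job or
    None when nothing is queued, train returns a path to the trained model,
    and export uploads that path somewhere (for example a GCP bucket).
    Returns the number of jobs completed.
    """
    completed = 0
    polls = 0
    # max_polls=None means "run forever", which matches a long-lived trainer.
    while max_polls is None or polls < max_polls:
        polls += 1
        job = fetch_job()
        if job is None:
            # Nothing to train yet; wait before asking the API again.
            sleep(poll_seconds)
            continue
        model_path = train(job)
        if export is not None:
            export(model_path)
        completed += 1
    return completed
```

Passing the API client and the training routine in as callables keeps the loop easy to exercise without a GPU or a running API.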

Prerequisites for running the trainer

The SuperInsight FineTuning API must be set up and running; assign its URL to the API_HOST environment variable.
Set all the required environment variables so the trainer can pull jobs from the API and train successfully.
Host the application on a machine with a GPU; the GPU required may differ depending on which base model you are using.
To customize your wandb options, use the WANDB_ variables below or any of the ones listed in the wandb documentation.

Environment Variables

Variable	Usage	Required	Default
API_HOST	The API host that is used	True	None
NUM_GPUS	The number of GPUs used to train each job	True	1
NUM_TRAIN_EPOCS	The number of epochs to train for each job	True	1
GRADIENT_ACCUMULATION_STEPS	gradientAccumulationSteps used to train each job	True	1
PER_DEVICE_TRAIN_BATCH_SIZE	perDeviceTrainBatchSize used to train each job	True	2
EXPORT_GCP_STORAGE_BUCKET	To export models to a GCP bucket, set the bucket name here	False	None
EXPORT_GCP_STORAGE_FOLDER	To export models to a GCP bucket, set the folder name within the bucket here	False	None
GOOGLE_APPLICATION_CREDENTIALS	To export models to a GCP bucket, you will need to provide your credentials here	False	None
WANDB_API_KEY	The API Key for wandb	False	None
WANDB_NAME	The run name for wandb	False	None
WANDB_NOTES	Notes for wandb	False	None
WANDB_ENTITY	The entity name for wandb	False	None
WANDB_PROJECT	The project name for wandb	False	None
WANDB_MODE	wandb mode	False	None
WANDB_DISABLED	Disable wandb	False	True
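As an illustration of how the variables in the table might be read, here is a minimal sketch that applies the listed defaults and rejects a missing API_HOST. `TrainerConfig` and `load_config` are hypothetical names, and the parsing rules (integer casts, which strings count as true) are assumptions rather than the trainer's actual behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TrainerConfig:
    """Hypothetical container for a subset of the variables in the table."""
    api_host: str
    num_gpus: int = 1
    num_train_epochs: int = 1
    per_device_train_batch_size: int = 2
    export_bucket: Optional[str] = None
    export_folder: Optional[str] = None
    wandb_disabled: bool = True


def load_config(env: Mapping[str, str] = os.environ) -> TrainerConfig:
    """Build a TrainerConfig from environment variables.

    Defaults mirror the table; the accepted truthy strings for
    WANDB_DISABLED are an assumption made for this sketch.
    """
    api_host = env.get("API_HOST")
    if not api_host:
        raise ValueError("API_HOST must be set to the FineTuning API URL")
    return TrainerConfig(
        api_host=api_host.rstrip("/"),
        num_gpus=int(env.get("NUM_GPUS", "1")),
        # Spelling follows the table (NUM_TRAIN_EPOCS).
        num_train_epochs=int(env.get("NUM_TRAIN_EPOCS", "1")),
        per_device_train_batch_size=int(env.get("PER_DEVICE_TRAIN_BATCH_SIZE", "2")),
        export_bucket=env.get("EXPORT_GCP_STORAGE_BUCKET") or None,
        export_folder=env.get("EXPORT_GCP_STORAGE_FOLDER") or None,
        wandb_disabled=env.get("WANDB_DISABLED", "true").strip().lower()
        in ("1", "true", "yes"),
    )
```

Failing fast on a missing API_HOST surfaces a misconfigured container at startup instead of on the first poll.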

Available Base Models

Here is a summary of the base models and the hardware each has been tested on so far.

Base Model ID	Hardware Tested On	Summary
gpt-neo-125m	NVIDIA V100 GPU	The `EleutherAI/gpt-neo-125M` model. Good option for testing.
gpt-neo-1.3b	NVIDIA V100 GPU	The `EleutherAI/gpt-neo-1.3B` model.
gpt-neo-2.7b	NVIDIA V100 GPU	The `EleutherAI/gpt-neo-2.7B` model.
gpt-j-6b	NVIDIA V100 GPU	The `EleutherAI/gpt-j-6B` model.
gpt-neox-20b	N/A	The `EleutherAI/gpt-neox-20b` model. Not yet tested.
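The table above amounts to a lookup from base model ID to model name. A minimal sketch of such a lookup follows; `BASE_MODELS` and `resolve_base_model` are hypothetical names, and the case-insensitive matching is an assumption made for this example.

```python
# Base model IDs and the model names they refer to, copied from the table.
BASE_MODELS = {
    "gpt-neo-125m": "EleutherAI/gpt-neo-125M",
    "gpt-neo-1.3b": "EleutherAI/gpt-neo-1.3B",
    "gpt-neo-2.7b": "EleutherAI/gpt-neo-2.7B",
    "gpt-j-6b": "EleutherAI/gpt-j-6B",
    "gpt-neox-20b": "EleutherAI/gpt-neox-20b",  # listed as not yet tested
}


def resolve_base_model(model_id: str) -> str:
    """Return the model name for a base model ID, or raise ValueError."""
    key = model_id.strip().lower()
    try:
        return BASE_MODELS[key]
    except KeyError:
        known = ", ".join(sorted(BASE_MODELS))
        raise ValueError(f"Unknown base model {model_id!r}; expected one of: {known}") from None
```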

Run Trainer with docker

Here are some sample commands to get started by running the Docker image directly.

Run with GPUs

docker run --gpus all --name superinsight-trainer-gpt superinsight/superinsight-trainer-gpt:latest

Run with env

docker run --gpus all --env API_HOST=https://finetuning.api.yourdomain --name superinsight-trainer-gpt superinsight/superinsight-trainer-gpt:latest
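When several environment variables need to be passed, the same command can be assembled programmatically. The sketch below builds the argument list for the command shown above; `docker_run_command` is a hypothetical helper, and only the `--gpus`, `--env` and `--name` flags already used in this README appear in it.

```python
import shlex
from typing import Dict, List

IMAGE = "superinsight/superinsight-trainer-gpt:latest"


def docker_run_command(
    env: Dict[str, str],
    name: str = "superinsight-trainer-gpt",
    image: str = IMAGE,
) -> List[str]:
    """Return the docker run argument list with one --env flag per variable."""
    args = ["docker", "run", "--gpus", "all"]
    for key, value in env.items():
        args += ["--env", f"{key}={value}"]
    args += ["--name", name, image]
    return args


if __name__ == "__main__":
    cmd = docker_run_command({"API_HOST": "https://finetuning.api.yourdomain"})
    # Print a shell-safe version of the command for copy and paste.
    print(shlex.join(cmd))
```

The list form can be passed straight to `subprocess.run` without shell quoting concerns.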
