Fine-tuning and inference tools for the CogView4 and CogVideoX model series.

THUDM/CogKit

CogKit

Introduction

CogKit is an open-source project that gives researchers and developers a user-friendly interface to ZhipuAI's CogView (image generation) and CogVideoX (video generation) models. It streamlines multimodal generation tasks such as text-to-image (T2I), text-to-video (T2V), and image-to-video (I2V). Users must comply with applicable legal and ethical guidelines and use the models responsibly.

Visit our Docs to start.

Features

  • Fine-tuning Methods: Supports LoRA and full-parameter fine-tuning across various setups, including single-machine single-GPU, single-machine multi-GPU, and multi-machine multi-GPU configurations.
  • Inference: Provides an OpenAI-style API (T2I only) and a command-line interface for running and deploying models.
  • Embed Cache: Optimizes GPU memory usage to enhance efficiency during inference.
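
As a rough illustration of what "OpenAI-style" could mean for the T2I API, the sketch below builds (but does not send) an image generation request in the shape used by OpenAI's Images API. The base URL, port, and model name are hypothetical placeholders, and it is an assumption that the CogKit server mirrors the `/images/generations` path and these field names; check the Docs for the actual endpoint and parameters.

```python
import json
import urllib.request

# Hypothetical values: adjust to wherever your API server is actually running.
BASE_URL = "http://localhost:8000/v1"  # assumed host, port, and path prefix
MODEL_NAME = "cogview4"                # assumed model identifier


def build_image_request(prompt, size="1024x1024", n=1):
    """Build (but do not send) an OpenAI-style image generation request."""
    payload = {"model": MODEL_NAME, "prompt": prompt, "n": n, "size": size}
    return urllib.request.Request(
        url=f"{BASE_URL}/images/generations",  # path follows the OpenAI convention
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_image_request("a watercolor painting of a lighthouse at dawn")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To send it once a server is running:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Keeping request construction separate from sending makes it easy to inspect the payload before pointing it at a live server.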

Roadmap

  • Add support for the CogView4 ControlNet model
  • Add Docker support for easy deployment

License

This project is licensed under the Apache 2.0 License.