Stars
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Muzic: Music Understanding and Generation with Artificial Intelligence
Instant voice cloning by MIT and MyShell. Audio foundation model.
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
Structured state space sequence models
Godot Gym API is an open-source framework for using the Godot 3 game engine as a 3D environment for training reinforcement learning agents implemented in Python on any data, including images and point clo…
Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.net/forum?id=3jooF27-0Wy
This repository contains the official implementation of the research paper "FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization" (ICCV 2023)
Accessible large language models via k-bit quantization for PyTorch.
QLoRA: Efficient Finetuning of Quantized LLMs
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
StarGAN v2 - Official PyTorch Implementation (CVPR 2020)
Image-to-Image Translation in PyTorch
StyleFlow: Attribute-conditioned Exploration of StyleGAN-generated Images using Conditional Continuous Normalizing Flows (ACM TOG 2021)
A Non-Euclidean Rendering Engine for 3D scenes.