GitHub
- San Francisco
Starred repositories
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Vesuvius Challenge Ink Crackle Labeling Tool for Scroll Segmentations
aider is AI pair programming in your terminal
A small script that makes it easy to fling a folder of images onto the Samsung Frame TV
Run a GPT model in the browser with WebGPU. An implementation of GPT inference in under ~1500 lines of vanilla JavaScript.
Virtual Unwrapping Tool for the Vesuvius Challenge
Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.
The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
Keeping language models honest by directly eliciting knowledge encoded in their activations.
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and open source.
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Transformer-related optimization, including BERT and GPT
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
The easiest way to add pricing to your SaaS. Get billing over with.
FauxPilot - an open-source alternative to the GitHub Copilot server
Synchronize text files between browser and disk using Yjs and the File System Access API
CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.