Stars
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
A Library for Advanced Deep Time Series Models.
A unified framework for machine learning with time series
Everything you need to know to build your own RAG application
A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.
Open-source libraries and APIs for building custom preprocessing pipelines for labeling, training, or production machine learning.
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
Toolkit for linearizing PDFs for LLM datasets/training
Advanced NLP, Spring 2025 https://cmu-l3.github.io/anlp-spring2025/
fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
SGLang is a fast serving framework for large language models and vision language models.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
🔊 Text-Prompted Generative Audio Model
Composable building blocks to build Llama Apps
Get your documents ready for gen AI
Coding an LLM and its building blocks from scratch.
Machine Learning Journal for Intermediate to Advanced Topics.
RetroLLM: Empowering LLMs to Retrieve Fine-grained Evidence within Generation
Python tool for converting files and office documents to Markdown.
A data modelling layer built on top of polars and pydantic
Agent Framework / shim to use Pydantic with LLMs
Generate large synthetic data using an LLM
The repo associated with the Manning Publication
An API/Schema registry - stores APIs and Schemas.
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.