Stars
📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) o…
Replace 'hub' with 'ingest' in any GitHub URL to get a prompt-friendly extract of a codebase
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
Free and Open Source Machine Translation API. Self-hosted, offline capable and easy to set up.
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
A curated list of awesome information retrieval resources
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and open-source models.
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
Python tool for converting files and office documents to Markdown.
Hackable and optimized Transformers building blocks, supporting a composable construction.
Fast, accurate, lightweight Python library for generating state-of-the-art embeddings
An integration of Qdrant ANN vector database backend with Haystack
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
Benchmarks of approximate nearest neighbor libraries in Python
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data…
👩🏻‍🍳 A collection of example notebooks
General technology for enabling AI capabilities w/ LLMs and MLLMs
Code and Checkpoints for "Generate rather than Retrieve: Large Language Models are Strong Context Generators" in ICLR 2023.
This repository contains various advanced techniques for Retrieval-Augmented Generation (RAG) systems.
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI Tools & Agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.
A library for efficient similarity search and clustering of dense vectors.
Open-source vector similarity search for Postgres
List of startups doing AI & ML
A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca)