This repository contains a curated (incomplete) list of open-source and fine-tuned Large Language Models.
LLaMA (Meta)
LLaMA (Large Language Model Meta AI) is a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI. Smaller, more performant models such as LLaMA enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratizing access in this important, fast-changing field.
- LLaMA Website: Introducing LLaMA: A foundational, 65-billion-parameter language model (facebook.com)
Alpaca (Stanford)
We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta’s LLaMA 7B model. We train the Alpaca model on 52K instruction-following demonstrations generated in the style of self-instruct using text-davinci-003. On the self-instruct evaluation set, Alpaca shows many behaviors similar to OpenAI’s text-davinci-003, but is also surprisingly small and easy/cheap to reproduce.
Alpaca-LoRA
This repository contains code for reproducing the Stanford Alpaca results using low-rank adaptation (LoRA). We provide an Instruct model of similar quality to text-davinci-003 that can run on a Raspberry Pi (for research), and the code is easily extended to the 13b, 30b, and 65b models.
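To make the low-rank adaptation idea concrete, here is a minimal sketch in plain Python. It is not the Alpaca-LoRA repository's code; the function names (`matmul`, `merge_lora`, `lora_param_counts`) and the example dimensions are assumptions chosen for illustration. It follows the usual LoRA formulation, in which a frozen weight matrix W is adjusted by a trainable low-rank product scaled by alpha / r.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A).

    W is the frozen weight (d_out x d_in), B is d_out x r, A is r x d_in.
    Only A and B would be trained; W stays fixed.
    """
    scale = alpha / r
    delta = matmul(B, A)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


def lora_param_counts(d_out, d_in, r):
    """Compare full fine-tuning parameters with LoRA parameters for one matrix."""
    full = d_out * d_in
    lora = r * (d_out + d_in)
    return full, lora


# Hypothetical 2x2 weight with a rank-1 update.
W = [[1.0, 2.0], [3.0, 4.0]]
A = [[1.0, 1.0]]        # r x d_in
B = [[1.0], [2.0]]      # d_out x r
merged = merge_lora(W, A, B, alpha=2, r=1)

# Hypothetical layer size, chosen only to show the scale of the saving.
full, lora = lora_param_counts(4096, 4096, 8)
```

With B set to all zeros (the usual starting point), the merged weight equals W, so training begins from the unmodified base model. For the hypothetical 4096 x 4096 layer at rank 8, the adapter has 65,536 trainable values against 16,777,216 in the full matrix, which is why this approach is cheap to train and store.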
Baize
- Paper: 2304.01196.pdf (arxiv.org)
Koala
- GitHub: EasyLM/koala.md at main · young-geng/EasyLM (github.com)
- Blog: Koala: A Dialogue Model for Academic Research — The Berkeley Artificial Intelligence Research Blog
Vicuna (FastChat)
We introduce Vicuna-13B, an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation using GPT-4 as a judge shows Vicuna-13B achieves more than 90%* quality of OpenAI ChatGPT and Google Bard while outperforming other models like LLaMA and Stanford Alpaca in more than 90%* of cases. The cost of training Vicuna-13B is around $300. The code and weights, along with an online demo, are publicly available for non-commercial use.
- GitHub: lm-sys/FastChat: The release repo for “Vicuna: An Open Chatbot Impressing GPT-4” (github.com)
- Website: Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality
llama.cpp
llama.cpp allows users to run the LLaMA model on their local computers using C/C++. According to the documentation, llama.cpp supports the following models and runs on moderately fast PCs:
LLaMA | Alpaca | GPT4All | Vicuna | Koala | OpenBuddy (Multilingual) | Pygmalion 7B / Metharme 7B
LLaMA-Adapter V2
Lit-LLaMA
StableVicuna
- Website: Stability AI releases StableVicuna, the AI World’s First Open Source RLHF LLM Chatbot — Stability AI
- Hugging Face: StableVicuna — a Hugging Face Space by CarperAI
StackLLaMA
StableLM (StabilityAI)
- Website: Stability AI Launches the First of its StableLM Suite of Language Models — Stability AI
- GitHub: Stability-AI/StableLM: StableLM: Stability AI Language Models (github.com)
- Hugging Face: Stablelm Tuned Alpha Chat — a Hugging Face Space by stabilityai
GPT4All
GPT4All is an ecosystem to train and deploy powerful and customized large language models that run locally on consumer-grade CPUs. The goal is simple: be the best instruction-tuned, assistant-style language model that any person or enterprise can freely use, distribute and build on.
- GitHub: nomic-ai/pyllamacpp: Official supported Python bindings for llama.cpp + gpt4all (github.com)
GPT-J (EleutherAI)
GPT4All-J
GPT-NeoX (EleutherAI)
Pythia (EleutherAI)
- GitHub: EleutherAI/pythia (github.com)
Dolly 2.0 (Databricks)
Databricks’ Dolly is an instruction-following large language model trained on the Databricks machine learning platform that is licensed for commercial use. Based on pythia-12b, Dolly is trained on ~15k instruction/response fine tuning records databricks-dolly-15k generated by Databricks employees in capability domains from the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA and summarization.
- GitHub: dolly/data at master · databrickslabs/dolly (github.com)
- Blog post: Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM
- Hugging Face: databricks (Databricks) (huggingface.co)
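As a rough illustration of what instruction/response fine-tuning records look like, the sketch below tallies a few made-up records by capability domain. The records, the field names (`instruction`, `response`, `category`), and the category labels are assumptions for illustration only; check the dataset itself for its actual schema and label spellings.

```python
from collections import Counter

# Hypothetical records, not taken from databricks-dolly-15k.
# Field names and category labels are assumptions for illustration.
records = [
    {
        "instruction": "List three uses for a paperclip.",
        "response": "Holding paper, resetting a device, marking a page.",
        "category": "brainstorming",
    },
    {
        "instruction": "Is a tomato a fruit or a vegetable?",
        "response": "Botanically, a tomato is a fruit.",
        "category": "classification",
    },
    {
        "instruction": "Summarize: The meeting moved from Monday to Tuesday.",
        "response": "The meeting is now on Tuesday.",
        "category": "summarization",
    },
    {
        "instruction": "Suggest names for a bakery.",
        "response": "Rise and Shine, The Daily Loaf.",
        "category": "brainstorming",
    },
]


def count_by_category(rows):
    """Count how many records fall into each capability domain."""
    return Counter(row["category"] for row in rows)


counts = count_by_category(records)
```

A tally like this is a quick way to check how evenly a fine-tuning set covers the domains it is meant to cover before training on it.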
OpenAssistant Models
Open Assistant is a project meant to give everyone access to a great chat based large language model. We believe that by doing this we will create a revolution in innovation in language. In the same way that stable-diffusion helped the world make art and images in new ways we hope Open Assistant can help improve the world by improving language itself.
- GitHub: LAION-AI/Open-Assistant: OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
- Website: Open Assistant
Replit-Code (Replit)
- Hugging Face: https://huggingface.co/replit/replit-code-v1-3b
Segment Anything (Meta)
We aim to democratize segmentation by introducing the Segment Anything project: a new task, dataset, and model for image segmentation, as we explain in our research paper. We are releasing both our general Segment Anything Model (SAM) and our Segment Anything 1-Billion mask dataset (SA-1B), the largest ever segmentation dataset, to enable a broad set of applications and foster further research into foundation models for computer vision.
- GitHub: facebookresearch/segment-anything: The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model. (github.com)
- Website: Segment Anything
StarCoder (BigCode)
- Website: https://huggingface.co/bigcode
- Hugging Face: https://huggingface.co/spaces/bigcode/bigcode-editor and https://huggingface.co/spaces/bigcode/bigcode-playground
BLOOM (BigScience)
- Hugging Face: bigscience/bloom · Hugging Face
Flamingo (Google/DeepMind)
FLAN (Google)
FastChat-T5
- GitHub: lm-sys/FastChat: The release repo for “Vicuna: An Open Chatbot Impressing GPT-4” (github.com)
Flan-Alpaca
Commercial Use LLMs
Pythia | Dolly | Open Assistant (Pythia family) | GPT-J-6B | GPT-NeoX | BLOOM | StableLM-Alpha | FastChat-T5