InstructLab is a model-agnostic open source AI project that facilitates contributions to Large Language Models (LLMs).
We are on a mission to let anyone shape generative AI by enabling contributed updates to existing LLMs in an accessible way.
Our community welcomes all those who would like to help us enable everyone to shape the future of generative AI.
There are many projects rapidly embracing and extending permissively licensed AI models, but they are faced with three main challenges:
- Contribution to LLMs is not possible directly. Contributions instead show up as forks, which forces consumers to choose a “best-fit” model that isn’t easily extensible. The forks are also expensive for model creators to maintain.
- The ability to contribute ideas is limited by a lack of AI/ML expertise. Contributors have to learn how to fork, train, and refine models to see their ideas move forward. This is a high barrier to entry.
- There is no direct community governance or best practice around review, curation, and distribution of forked models.
InstructLab is here to solve these problems.
The project enables community contributors to add "skills" or "knowledge" to a particular model.
InstructLab's model-agnostic technology gives model upstreams with sufficient infrastructure resources the ability to create regular builds of their open source licensed models, not by rebuilding and retraining the entire model, but by composing new skills into it.
Take a look at "lab-enhanced" models on the InstructLab Hugging Face page.
- Check out the Community README to get started with using and contributing to the project.
- You may wish to read through the project's FAQ to get more familiar with all aspects of InstructLab.
- If you want to jump right in, head to the ilab documentation to get InstructLab set up and running.
- Learn more about the skills and knowledge you can add to models.
- You can find all the ways to collaborate with project maintainers and your fellow users of InstructLab beyond GitHub by visiting our project collaboration page.
- When you are ready to make a contribution to the project, please take a few minutes to look over our contribution guidelines to ensure your contribution is aligned with the project policies.
If you are getting started with InstructLab, it may be easiest to join one of our community meetings and speak with project maintainers and other InstructLab collaborators live. You can find details on all of our community meetings, including our open office hours each Thursday, in our detailed Project Meetings documentation.
Everyone is welcome and encouraged to attend if they will find value in joining. Please note that some meetings are recorded and the recordings are published on our project YouTube channel. The meeting host will advise all attendees if the meeting is being recorded. If you prefer to join with your camera off or dial in by phone so that you are not on camera or actively recorded, that is absolutely no problem.
Participation in all aspects of the InstructLab community (including but not limited to community meetings, mailing lists, real-time chat, and the project GitHub repos) is governed by our Code of Conduct.
See the project governance document for an overview of how the InstructLab project operates.
Security policies and practices, including reporting vulnerabilities, can be found in our security document.
InstructLab 🐶 uses a novel synthetic data-based alignment tuning method for Large Language Models (LLMs). The "lab" in InstructLab 🥼 stands for Large-Scale Alignment for ChatBots [1].
[1] Shivchander Sudalairaj*, Abhishek Bhandwaldar*, Aldo Pareja*, Kai Xu, David D. Cox, Akash Srivastava*. "LAB: Large-Scale Alignment for ChatBots", arXiv preprint arXiv:2403.01081, 2024. (* denotes equal contributions)
The InstructLab project is sponsored by Red Hat.
InstructLab was originally created by engineers from Red Hat and IBM Research.
The infrastructure used to regularly train models based on new contributions from the community is donated and maintained by IBM.