Update docs (#2015)
Address issue raised in #2008, made a new PR because I couldn't run
pre-commit for some reason
CactiStaccingCrane committed Mar 11, 2023
1 parent c1d3a47 commit cfb3750
Showing 8 changed files with 102 additions and 33 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -13,3 +13,6 @@ backend/openapi.json

# ignore jupyter notebook checkpoints
.ipynb_checkpoints

# edit docs using obsidian.md, these files should not appear in the repo
.obsidian/
7 changes: 6 additions & 1 deletion docs/docs/data/README.md
@@ -1,3 +1,8 @@
# Data

Resources related to data.
Resources related to data:

- [Data schemas](https://projects.laion.ai/Open-Assistant/docs/data/schemas)
- [Datasets](https://projects.laion.ai/Open-Assistant/docs/data/datasets)
- [Data augmentation](https://projects.laion.ai/Open-Assistant/docs/data/augmentation)
- [Supervised datasets](https://projects.laion.ai/Open-Assistant/docs/data/supervised-datasets)
6 changes: 5 additions & 1 deletion docs/docs/guides/README.md
@@ -1,3 +1,7 @@
# Guides

Useful guides to using [Open-Assistant](https://open-assistant.io/).
Useful guides for Open Assistant:

- [General guidelines for using open-assistant.io](https://projects.laion.ai/Open-Assistant/docs/guides/guidelines)
- [Example responses](https://projects.laion.ai/Open-Assistant/docs/guides/examples)
- [Developer guide, contains a lot of technical info](https://projects.laion.ai/Open-Assistant/docs/guides/developers)
95 changes: 77 additions & 18 deletions docs/docs/intro.md
@@ -1,15 +1,85 @@
# Introduction

OpenAssistant is a chat-based assistant that understands tasks, can interact
with third-party systems, and retrieve information dynamically to do so.
> The FAQ page is available
> [here](https://projects.laion.ai/Open-Assistant/docs/faq).
It can be extended and personalized easily and is developed as free, open-source
software.
Open Assistant (abbreviated as OA) is a chat-based, open-source assistant.
The vision of the project is to make a large language model that can run on a
single high-end consumer GPU. With some modifications, Open Assistant should
also be able to interface easily with third-party applications and retrieve
information from databases and the Internet.

## Our Vision
You should join the
[Open Assistant Discord server](https://ykilcher.com/open-assistant-discord)
and/or comment on GitHub issues before making any major changes. Most dev
communications take place on the Discord server. There are four main areas that
you can work on:

We want OpenAssistant to be the single, unifying platform that all other systems
use to interface with humans.
1. Ranking, labelling, and writing responses on
[open-assistant.io](https://www.open-assistant.io). See the
[tasks docs section](https://projects.laion.ai/Open-Assistant/docs/tasks) for
more information.
2. Curating datasets and performing data augmentation. This includes scraping,
gathering other public datasets, etc. Most of these efforts are
concentrated in
[`/data/datasets`](https://github.com/LAION-AI/Open-Assistant/tree/main/data/datasets)
and are documented
[here](https://projects.laion.ai/Open-Assistant/docs/data/datasets).
3. Creating and fine-tuning Open Assistant itself. For that, you should pay
special attention to
[`/model`](https://github.com/LAION-AI/Open-Assistant/tree/main/model).
4. Developing [open-assistant.io](https://www.open-assistant.io). Take a close look at
[`/website`](https://github.com/LAION-AI/Open-Assistant/tree/main/website) as
well as
[`/backend`](https://github.com/LAION-AI/Open-Assistant/tree/main/backend).

## GitHub folders explanation

> Do read the
> [developer guide](https://projects.laion.ai/Open-Assistant/docs/guides/developers)
> for further information.
Here's a list of the first-level folders on
[Open Assistant's GitHub page](https://github.com/LAION-AI/Open-Assistant/).

- [`/ansible`](https://github.com/LAION-AI/Open-Assistant/tree/main/ansible) -
for managing the full stack using
[Ansible](<https://en.wikipedia.org/wiki/Ansible_(software)>)
- [`/assets`](https://github.com/LAION-AI/Open-Assistant/tree/main/assets) -
contains logos
- [`/backend`](https://github.com/LAION-AI/Open-Assistant/tree/main/backend) -
backend for open-assistant.io and the Discord bots; may be helpful for testing
API calls locally
- [`/copilot`](https://github.com/LAION-AI/Open-Assistant/tree/main/copilot) -
read more at AWS's [Copilot](https://aws.github.io/copilot-cli/). And no, this
is not a folder that contains something similar to OpenAI's Codex.
- [`/data`](https://github.com/LAION-AI/Open-Assistant/tree/main/data) -
contains
[`/data/datasets`](https://github.com/LAION-AI/Open-Assistant/tree/main/data/datasets),
which holds data scraping code and links to datasets on Hugging Face
- [`/deploy`](https://github.com/LAION-AI/Open-Assistant/tree/main/deploy)
- [`/discord-bots`](https://github.com/LAION-AI/Open-Assistant/tree/main/discord-bots) -
frontend as Discord bots for volunteer data collection
- [`/docker`](https://github.com/LAION-AI/Open-Assistant/tree/main/docker)
- [`/docs`](https://github.com/LAION-AI/Open-Assistant/tree/main/docs) - this
website!
- [`/inference`](https://github.com/LAION-AI/Open-Assistant/tree/main/inference) -
inference pipeline for the Open Assistant model
- [`/model`](https://github.com/LAION-AI/Open-Assistant/tree/main/model) -
currently contains scripts and tools for training/fine-tuning Open Assistant
and other neural networks
- [`/notebooks`](https://github.com/LAION-AI/Open-Assistant/tree/main/notebooks) -
_DEPRECATED in favor of_
[`/data/datasets`](https://github.com/LAION-AI/Open-Assistant/tree/main/data/datasets).
Contains Jupyter notebooks for data scraping and augmentation
- [`/oasst-shared`](https://github.com/LAION-AI/Open-Assistant/tree/main/oasst-shared) -
shared Python code for Open Assistant
- [`/scripts`](https://github.com/LAION-AI/Open-Assistant/tree/main/scripts) -
contains miscellaneous scripts
- [`/text-frontend`](https://github.com/LAION-AI/Open-Assistant/tree/main/text-frontend)
- [`/website`](https://github.com/LAION-AI/Open-Assistant/tree/main/website) -
everything in [open-assistant.io](https://www.open-assistant.io), including
gamification

## Principles

@@ -21,14 +91,3 @@ use to interface with humans.
hardware
- We rapidly validate our ML experiments on a small scale, before going to a
supercluster

## Main Efforts

- Data Collection Code → Backend, website, and discord bot to collect data
- Instruction Dataset Gathering → Scraping & cleaning web data
- Gamification → Leaderboards & more, to make data collection more fun
- Model Training → Experiments on pseudo- and real-data
- Infrastructure → Collection, training, and inference
- Data Collection → This is the bulk of the work
- Data Augmentation → Making more data from little data
- Privacy and Safety → Protecting sensitive data
5 changes: 5 additions & 0 deletions docs/docs/presentations/README.md
@@ -1,3 +1,8 @@
# Presentations

Useful presentations that have been published about the project.

- [OpenAssistant Roadmap](https://docs.google.com/presentation/d/1n7IrAOVOqwdYgiYrXc8Sj0He8krn5MVZO_iLkCjTtu0/edit?usp=sharing):
High-level vision and roadmap (December 2022).
- [OpenAssistant MVP](https://docs.google.com/presentation/d/1MXH5kJcew7h1aA9PBx2MirkEkjCBLnABbbrPsgbcyQg/edit?usp=sharing):
Goal: Crowd-Sourced Training Data Collection (January 2023).
6 changes: 0 additions & 6 deletions docs/docs/presentations/list.md

This file was deleted.

5 changes: 4 additions & 1 deletion docs/docs/research/README.md
@@ -1,3 +1,6 @@
# Research

Useful research material.
Useful research materials:

- [General](https://projects.laion.ai/Open-Assistant/docs/research/general)
- [Cohere Grounded QA](https://projects.laion.ai/Open-Assistant/docs/research/search-based-qa)
8 changes: 2 additions & 6 deletions docs/sidebars.js
@@ -71,13 +71,9 @@ const sidebars = {
items: ["research/general", "research/search-based-qa"],
},
{
type: "category",
type: "doc",
label: "Presentations",
link: {
type: "doc",
id: "presentations/README",
},
items: ["presentations/list"],
id: "presentations/README",
},
{
type: "doc",
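For readers skimming the `docs/sidebars.js` hunk, here is a rough sketch of the "Presentations" entry before and after this commit, reconstructed only from the lines visible in the diff. The variable names and the `referencedDocIds` helper are hypothetical and exist purely for illustration; they are not part of the repository or of Docusaurus.

```javascript
// Before (as the removed lines suggest): a category whose landing page is the
// README and whose only child is the now-deleted presentations/list doc.
const presentationsBefore = {
  type: "category",
  label: "Presentations",
  link: { type: "doc", id: "presentations/README" },
  items: ["presentations/list"],
};

// After (as the added lines suggest): a single doc entry pointing at the README.
const presentationsAfter = {
  type: "doc",
  label: "Presentations",
  id: "presentations/README",
};

// Hypothetical helper: collect every doc id that a sidebar item refers to.
function referencedDocIds(item) {
  if (typeof item === "string") return [item]; // shorthand doc id
  if (item.type === "doc") return [item.id];
  if (item.type === "category") {
    const own = item.link && item.link.type === "doc" ? [item.link.id] : [];
    return own.concat((item.items || []).flatMap(referencedDocIds));
  }
  return [];
}

console.log(referencedDocIds(presentationsBefore));
console.log(referencedDocIds(presentationsAfter));
```

Under these assumptions, the new entry no longer refers to `presentations/list`, which is consistent with `docs/docs/presentations/list.md` being deleted in the same commit.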
