docs: Review Developer docs #4776

Merged
merged 13 commits into from
May 16, 2024
39 changes: 32 additions & 7 deletions argilla-frontend/README.md
@@ -53,7 +53,9 @@ https://github.com/argilla-io/argilla/assets/1107111/49e28d64-9799-4cac-be49-19d

## 🚀 Quickstart

Argilla is an open-source data curation platform for LLMs. Using Argilla, everyone can build robust language models through faster data curation using both human and machine feedback. We provide support for each step in the MLOps cycle, from data labeling to model monitoring.
Argilla is an open-source data curation platform for LLMs. Using Argilla, everyone can build robust language models
through faster data curation using both human and machine feedback. We provide support for each step in the MLOps cycle,
from data labeling to model monitoring.

There are different options to get started:

@@ -65,6 +67,12 @@ There are different options to get started:

## 🖥️ FRONTEND

-Before running Argilla frontend server, you need to install Node version 18:

```bash
-brew install node@18
```
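
As a rough sketch, the installed Node version can be checked against the required major version before installing dependencies. The helper name `node_major` is hypothetical and not part of the project; only `node --version` is assumed to be available.

```bash
# Hypothetical helper: extract the major version from a string such as "v18.19.1".
node_major() {
  version="${1#v}"                 # drop the leading "v", if present
  printf '%s\n' "${version%%.*}"   # keep everything before the first dot
}

# Compare the installed Node.js (if any) against the required major version.
required=18
if command -v node >/dev/null 2>&1; then
  installed="$(node_major "$(node --version)")"
  if [ "$installed" = "$required" ]; then
    echo "Node $installed detected"
  else
    echo "Node $required is required, found $installed"
  fi
else
  echo "Node is not installed"
fi
```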

<h3>💣 Install dependencies</h3>

```bash
@@ -89,17 +97,33 @@ npm run generate

## 📏 Principles

- **Open**: Argilla is free, open-source, and 100% compatible with major NLP libraries (Hugging Face transformers, spaCy, Stanford Stanza, Flair, etc.). In fact, you can **use and combine your preferred libraries** without implementing any specific interface.
- **Open**: Argilla is free, open-source, and 100% compatible with major NLP libraries (Hugging Face transformers,
spaCy, Stanford Stanza, Flair, etc.). In fact, you can **use and combine your preferred libraries** without
implementing any specific interface.

- **End-to-end**: Most annotation tools treat data collection as a one-off activity at the beginning of each project. In real-world projects, data collection is a key activity of the iterative process of ML model development. Once a model goes into production, you want to monitor and analyze its predictions and collect more data to improve your model over time. Argilla is designed to close this gap, enabling you to **iterate as much as you need**.
- **End-to-end**: Most annotation tools treat data collection as a one-off activity at the beginning of each project. In
real-world projects, data collection is a key activity of the iterative process of ML model development. Once a model
goes into production, you want to monitor and analyze its predictions and collect more data to improve your model over
time. Argilla is designed to close this gap, enabling you to **iterate as much as you need**.

- **User and Developer Experience**: The key to sustainable NLP solutions is to make it easier for everyone to contribute to projects. _Domain experts_ should feel comfortable interpreting and annotating data. _Data scientists_ should feel free to experiment and iterate. _Engineers_ should feel in control of data pipelines. Argilla optimizes the experience for these core users to **make your teams more productive**.
- **User and Developer Experience**: The key to sustainable NLP solutions is to make it easier for everyone to
  contribute to projects. _Domain experts_ should feel comfortable interpreting and annotating data. _Data scientists_
  should feel free to experiment and iterate. _Engineers_ should feel in control of data pipelines. Argilla optimizes
  the experience for these core users to **make your teams more productive**.

- **Beyond hand-labeling**: Classical hand-labeling workflows are costly and inefficient, but having humans in the loop is essential. Easily combine hand-labeling with active learning, bulk-labeling, zero-shot models, and weak supervision in **novel data annotation workflows**.
- **Beyond hand-labeling**: Classical hand-labeling workflows are costly and inefficient, but having humans in the loop
  is essential. Easily combine hand-labeling with active learning, bulk-labeling, zero-shot models, and weak supervision
  in **novel data annotation workflows**.

## 🫱🏾‍🫲🏼 Contribute

We love contributors and have launched a [collaboration with JustDiggit](https://argilla.io/blog/introducing-argilla-community-growers) to hand out our very own bunds and help the re-greening of sub-Saharan Africa. To help our community with the creation of contributions, we have created our [developer](https://docs.argilla.io/en/latest/community/developer_docs.html) and [contributor](https://docs.argilla.io/en/latest/community/contributing.html) docs. Additionally, you can always [schedule a meeting](https://calendly.com/argilla-office-hours/30min) with our Developer Advocacy team so they can get you up to speed.
We love contributors and have launched
a [collaboration with JustDiggit](https://argilla.io/blog/introducing-argilla-community-growers) to hand out our very
own bunds and help the re-greening of sub-Saharan Africa. To help our community with the creation of contributions, we
have created our [developer](https://docs.argilla.io/en/latest/community/developer_docs.html)
and [contributor](https://docs.argilla.io/en/latest/community/contributing.html) docs. Additionally, you can
always [schedule a meeting](https://calendly.com/argilla-office-hours/30min) with our Developer Advocacy team so they
can get you up to speed.

## 🥇 Contributors

@@ -111,4 +135,5 @@ We love contributors and have launched a [collaboration with JustDiggit](https:/

## 🗺️ Roadmap

We continuously work on updating [our plans and our roadmap](https://github.com/orgs/argilla-io/projects/10/views/1) and we love to discuss those with our community. Feel encouraged to participate.
We continuously work on updating [our plans and our roadmap](https://github.com/orgs/argilla-io/projects/10/views/1) and
we love to discuss those with our community. Feel encouraged to participate.
67 changes: 15 additions & 52 deletions argilla-server/README.md
@@ -32,47 +32,21 @@
</a>
</p>

Argilla is a **collaboration platform for AI engineers and domain experts** that require **high-quality outputs, full data ownership, and overall efficiency**.
Argilla is a **collaboration platform for AI engineers and domain experts** that require **high-quality outputs, full
data ownership, and overall efficiency**.

This repository only contains developer info about the backend server. If you want to get started, we recommend taking a look at our [main repository](https://github.com/argilla-io/argilla) or our [documentation](https://docs.argilla.io/).
This repository only contains developer info about the backend server. If you want to get started, we recommend taking a
look at our [main repository](https://github.com/argilla-io/argilla) or our [documentation](https://docs.argilla.io/).

If you are a contributor or want to understand what is going on under the hood, please keep reading the documentation below.

## Clone repository

`argilla-server` is using `argilla` repository as submodule to build frontend statics so when cloning use the following command:

```sh
git clone --recurse-submodules git@github.com:argilla-io/argilla-server.git
```

If you already cloned the repository without using `--recurse-submodules` you can init and update the submodules with:

```sh
git submodule update --remote --recursive --init
```

> [!IMPORTANT]
> By default `argilla` submodule is using `develop` branch so the previous command will get the latest commit from that branch.

### Specify a tag for argilla submodule

When doing a release, we should change the `argilla` submodule to use a specific tag. In the following example, we set tag `v1.22.0`:

```sh
cd argilla
git fetch --tags
git checkout v1.22.0
```

> [!NOTE]
> You should see some changes on the `argilla-server` root folder where the subproject commit is now changed to the one from the tag version. Feel free to commit these changes.
If you are a contributor or want to understand what is going on under the hood, please keep reading the
documentation below.

## Development environment

By default all commands executed with `pdm run` will get environment variables from `.env.dev` except command `pdm test` that will overwrite some of them using values coming from `.env.test` file.
By default all commands executed with `pdm run` will get environment variables from `.env.dev` except command `pdm test`
that will overwrite some of them using values coming from `.env.test` file.

These environment variables can be overrided if necessary so feel free to defined your own ones locally.
These environment variables can be overridden if necessary, so feel free to define your own locally.
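
The layering described above (base values from `.env.dev`, with `.env.test` replacing some of them) can be illustrated with plain shell sourcing. This is only a sketch of the idea: the variable names `EXAMPLE_DATABASE` and `EXAMPLE_LOG_LEVEL` and the file contents are hypothetical, and PDM's own loading mechanism may work differently.

```sh
# Illustration only: file contents and variable names are hypothetical.
workdir="$(mktemp -d)"
printf 'EXAMPLE_DATABASE=dev.db\nEXAMPLE_LOG_LEVEL=debug\n' > "$workdir/.env.dev"
printf 'EXAMPLE_DATABASE=test.db\n' > "$workdir/.env.test"

# Export every variable assigned while the files are sourced.
set -a
. "$workdir/.env.dev"    # base values
. "$workdir/.env.test"   # the later file wins for any key it redefines
set +a

echo "EXAMPLE_DATABASE=$EXAMPLE_DATABASE"
echo "EXAMPLE_LOG_LEVEL=$EXAMPLE_LOG_LEVEL"

# Clean up the temporary files.
rm -rf "$workdir"
```

Keys that appear only in the first file keep their original value, while keys redefined in the second file take the later value.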

### Run cli

@@ -82,37 +56,26 @@ pdm cli

### Run database migrations

By default, a SQLite database located at `~/.argilla/argilla.db` will be used. You can create the database and run migrations with the following custom PDM command:
By default, a SQLite database located at `~/.argilla/argilla.db` will be used. You can create the database and run
migrations with the following custom PDM command:

```sh
pdm migrate
```

### Run tests

A SQLite database located at `~/.argilla/argilla-test.db` will be automatically created to run tests. You can run the entire test suite using the following custom PDM command:
A SQLite database located at `~/.argilla/argilla-test.db` will be automatically created to run tests. You can run the
entire test suite using the following custom PDM command:

```sh
pdm test
```

## Run development server

### Build frontend static files

Before running Argilla development server we need to build the frontend static files. Node version 18 is required for this action:

```sh
brew install node@18
```

After that you can build the frontend static files:

```sh
./scripts/build_frontend.sh
```

After running the previous script you should have a folder at `src/argilla_server/static` with all the frontend static files successfully generated.
Note: If you need to run the frontend server, you can follow the instructions in
the [argilla-frontend](/argilla-frontend/README.md) project.

### Run uvicorn development server

8 changes: 0 additions & 8 deletions argilla-server/scripts/build_distribution.sh

This file was deleted.

7 changes: 0 additions & 7 deletions argilla-server/scripts/build_frontend.sh

This file was deleted.