From 8a318bc00c868e791d738320835ae0b7b060aae3 Mon Sep 17 00:00:00 2001
From: Francisco Aranda
Date: Mon, 6 May 2024 15:20:39 +0200
Subject: [PATCH 01/11] docs: Review Developer docs

---
 docs/_source/community/developer_docs.md | 110 ++++++++---------------
 1 file changed, 38 insertions(+), 72 deletions(-)

diff --git a/docs/_source/community/developer_docs.md b/docs/_source/community/developer_docs.md
index f8f6c8a2a4..dd46b9dda2 100644
--- a/docs/_source/community/developer_docs.md
+++ b/docs/_source/community/developer_docs.md
@@ -14,6 +14,13 @@ Being a developer in Argilla means that you are a part of the Argilla community

 - **Vue.js UI**: A web application to visualize and annotate your data, users, and teams. It is built with `Vue.js` and is directly deployed alongside the Argilla Server within our Argilla Docker image.

+The Argilla repository has a mono repo structure, which means that all the components are in the same repository, and is divided into the following folders:
+
+- `argilla`: The python SDK project
+- `argilla-frontend`: The Vue.js UI project
+- `argilla-server`: The FastAPI server project
+- `docs`: The documentation project
+
 For a proper installation, you will need to:

 - [Set up the Documentation Environment](#set-up-the-documentation-environment),
@@ -158,18 +165,21 @@ source .env/bin/activate
 Then, you just need to install Argilla with the command below. Note that we will install it in editable mode using the -e/--editable flag in the `pip` command to avoid having to re-install it on every code modification, but if you’re not planning to modify the code, you can just omit the -e/--editable flag.

 ```sh
+cd argilla
 pip install -e .
 ```

 Or installing just the `server` extra:

 ```sh
+cd argilla
 pip install -e ".[server]"
 ```

 Or you can install all the extras, which are also required to run the tests via pytest to make sure that the implemented features or the bug fixes work as expected, and that the unit/integration tests are passing. If you encounter any package or dependency problems, please consider upgrading or downgrading the related packages to solve the problem.

 ```sh
+cd argilla
 pip install -e ".[server,listeners,postgresql,integrations,tests]"
 ```

@@ -271,108 +281,64 @@ rm ~/.argilla/argilla.db

 After deleting the database, you will need to run the [database migration](#run-database-migration) task. By following these steps, you’ll have a fresh and clean database to work with.

-### Set up the Frontend
+### Set up Argilla Server

-If you want to work on the frontend of Argilla, you can do so by following the steps below.
+If you want to work on the server of Argilla, please visit the `argilla-server` [README.md](./argilla-server/README.md) file to see how to set up the server and run it on your local machine.

-#### Clone the Argilla Repository
+### Set up Argilla Frontend

-Firstly, you have to [fork our repository and clone the fork](<(/community/contributing.md#work-with-a-fork)>) to your computer.
+If you want to work on the frontend of Argilla, please visit the `argilla-frontend` [README.md](./argilla-frontend/README.md) file to see how to set up the frontend and run it on your local machine.

-```sh
-git clone https://github.com/[your-github-username]/argilla.git
-cd argilla
-```
-
-To keep your fork’s develop branch up to date with our repo you should add it as an [upstream remote branch](https://dev.to/louhayes3/git-add-an-upstream-to-a-forked-repo-1mik):
-
-```sh
-git remote add upstream https://github.com/argilla-io/argilla.git
-```
-
-#### Build Frontend Static Files
-
-Build the static UI files in case you want to work on the UI:
+## Make Your Contribution

-```sh
-bash scripts/build_frontend.sh
-```
+Now that everything is up and running, you can start to develop and contribute to Argilla! You can refer to our [contributer guide](/community/contributing.md) to have an understanding of how you can structure your contribution and upload it to the repository.

-#### Run Frontend Files
+### Run Tests

-Run the Argilla backend using Docker with the following command:
+#### Running Tests for the Argilla Python SDK
+Running tests at the end of every development cycle is indispensable to make sure that there are no breaking changes. In your Argilla environment, you can run all the tests as follows:

 ```sh
-docker run -d --name quickstart -p 6900:6900 argilla/argilla-quickstart:latest
+cd argilla/
+pytest tests
 ```

-Navigate to the `frontend` folder from your project's root directory.
-
-Then, execute the command:
+You can also run only the unit tests by providing the proper path:

 ```sh
-npm run dev
+cd argilla/
+pytest tests/unit
 ```

-To log in, use the username `admin` and the password `12345678`. If you need more information, please check [here](/getting_started/quickstart_installation.ipynb).
-
-### Set up the Server
-
-Before running the Argilla server, it is recommended to [build the frontend files](#build-frontend-static-files) to be able to access the UI on your local host.
-
-Then, to run Argilla backend, you will need an ElasticSearch instance up and running for the time being. You can get one running using Docker with the following command:
+For running more heavy integration tests you can just run pytest with the `tests/integration` folder:

 ```sh
-docker run -d --name elasticsearch-for-argilla -p 9200:9200 -p 9300:9300 -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" -e "discovery.type=single-node" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.5.3
+cd argilla/
+pytest tests/integration
 ```

-You will also need the vector database set up, as we show in the [Vector Database](#vector-database ) section.
-
+#### Running tests for the Argilla Server

-#### Launch Argilla Server
-
-Now that your system has the Argilla backend server, you are ready to start your server and access Argilla. You can either use the CLI command, which uses the port 6900 and the host 0.0.0.0 as default.
+To run the tests for the Argilla Server, you can use the following command:

 ```sh
-argilla server start ARGILLA_ENABLE_TELEMETRY=0
+cd argilla-server/
+pdm test test/unit
 ```

-Or you can start the server through uvicorn, with the following command:
+You can also set up a PostgreSQL database instead of the default sqlite backend:

 ```sh
-ARGILLA_ENABLE_TELEMETRY=0 uvicorn argilla.server.app:app --port 6900 --host 0.0.0.0 --reload
+cd argilla-server/
+ARGILLA_DATABASE_URL=postgresql://postgres:postgres@localhost:5432 pdm test tests/unit
 ```

-With this command, you will activate reloading the backend files after every change. This way, whenever you make a change and save it, it will automatically be reflected in your server.
-
-Note that we start the server with `ARGILLA_ENABLE_TELEMETRY=0` to stop anonymous reporting for our development environment. You can read more about telemetry settings on the [telemetry page](/reference/telemetry.md).
+#### Running tests for the Argilla Frontend

-## Make Your Contribution
-
-Now that everything is up and running, you can start to develop and contribute to Argilla! You can refer to our [contributer guide](/community/contributing.md) to have an understanding of how you can structure your contribution and upload it to the repository.
-
-### Run Tests
-
-Running tests at the end of every development cycle is indispensable to make sure that there are no breaking changes. In your Argilla environment, you can run all the tests as follows:
+To run the tests for the Argilla Frontend, you can use the following command:

 ```sh
-pytest tests
+cd argilla-frontend/
+npm run test
 ```

-You can also run only the unit tests by providing the proper path:
-
-```sh
-pytest tests/unit
-```
-
-For the unit tests, you can also set up a PostgreSQL database instead of the default sqlite backend:
-
-```sh
-ARGILLA_DATABASE_URL=postgresql://postgres:postgres@localhost:5432 pytest tests/unit
-```
-
-For running more heavy integration tests you can just run pytest with the `tests/integration` folder:
-
-```sh
-pytest tests/integration
-```

From d02e662701a6862a4c68d1b7288cfdd7eab58c74 Mon Sep 17 00:00:00 2001
From: Francisco Aranda
Date: Tue, 14 May 2024 15:29:31 +0200
Subject: [PATCH 02/11] Apply suggestions from code review

Co-authored-by: Natalia Elvira <126158523+nataliaElv@users.noreply.github.com>
---
 docs/_source/community/developer_docs.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/_source/community/developer_docs.md b/docs/_source/community/developer_docs.md
index dd46b9dda2..4348b07be3 100644
--- a/docs/_source/community/developer_docs.md
+++ b/docs/_source/community/developer_docs.md
@@ -14,7 +14,7 @@ Being a developer in Argilla means that you are a part of the Argilla community

 - **Vue.js UI**: A web application to visualize and annotate your data, users, and teams. It is built with `Vue.js` and is directly deployed alongside the Argilla Server within our Argilla Docker image.

-The Argilla repository has a mono repo structure, which means that all the components are in the same repository, and is divided into the following folders:
+The Argilla repository has a monorepo structure, which means that all the components live in the same repository: `argilla-io/argilla`. This repo is divided into the following folders:

 - `argilla`: The python SDK project
 - `argilla-frontend`: The Vue.js UI project
 - `argilla-server`: The FastAPI server project
 - `docs`: The documentation project
@@ -291,7 +291,7 @@ If you want to work on the frontend of Argilla, please visit the `argilla-fronte

 ## Make Your Contribution

-Now that everything is up and running, you can start to develop and contribute to Argilla! You can refer to our [contributer guide](/community/contributing.md) to have an understanding of how you can structure your contribution and upload it to the repository.
+Now that everything is up and running, you can start to develop and contribute to Argilla! You can refer to our [contributor guide](/community/contributing.md) to have an understanding of how you can structure your contribution and upload it to the repository.

 ### Run Tests

From 83d24594836f43bc920332adacad74de1200daad Mon Sep 17 00:00:00 2001
From: Francisco Aranda
Date: Tue, 14 May 2024 15:45:52 +0200
Subject: [PATCH 03/11] docs: Review developer docs

---
 docs/_source/community/developer_docs.md | 187 +++++++++++++++--------
 1 file changed, 121 insertions(+), 66 deletions(-)

diff --git a/docs/_source/community/developer_docs.md b/docs/_source/community/developer_docs.md
index 4348b07be3..9e6d74beed 100644
--- a/docs/_source/community/developer_docs.md
+++ b/docs/_source/community/developer_docs.md
@@ -1,25 +1,40 @@
 # Developer Documentation

-Being a developer in Argilla means that you are a part of the Argilla community and you are contributing to the development of Argilla. This page will guide you through the steps that you need to take to set up your development environment and start contributing to Argilla. Argilla is built upon different core components:
+Being a developer in Argilla means that you are a part of the Argilla community, and you are contributing to the
+development of Argilla. This page will guide you through the steps that you need to take to set up your development
+environment and start contributing to Argilla. Argilla is built upon different core components:

-- **Documentation**: The documentation for Argilla serves as an invaluable resource, providing a comprehensive and in-depth guide for users seeking to explore, understand, and effectively harness the core components of the Argilla ecosystem.
+- **Documentation**: The documentation for Argilla serves as an invaluable resource, providing a comprehensive and
+in-depth guide for users seeking to explore, understand, and effectively harness the core components of the Argilla
+ecosystem.

-- **Python SDK**: A Python SDK which is installable with `pip install argilla`, to interact with the Argilla Server and the Argilla UI. It provides an API to manage the data, configuration, and annotation workflows.
+- **Python SDK**: A Python SDK which is installable with `pip install argilla`, to interact with the Argilla Server and
+the Argilla UI. It provides an API to manage the data, configuration, and annotation workflows.

-- **FastAPI Server**: The core of Argilla is a Python `FastAPI server` that manages the data, by pre-processing it and storing it in the vector database. Also, it stores application information in the relational database. It provides a REST API to interact with the data from the Python SDK and the Argilla UI. It also provides a web interface to visualize the data.
+- **FastAPI Server**: The core of Argilla is a Python `FastAPI server` that manages the data, by pre-processing it and
+storing it in the vector database. Also, it stores application information in the relational database. It provides a
+REST API to interact with the data from the Python SDK and the Argilla UI. It also provides a web interface to visualize
+the data.

-- **Relational Database**: A relational database to store the metadata of the records and the annotations. `SQLite` is used as the default built-in option and is deployed separately with the Argilla Server but a separate `PostgreSQL` can be used too.
+- **Relational Database**: A relational database to store the metadata of the records and the annotations. `SQLite` is
+used as the default built-in option and is deployed separately with the Argilla Server but a separate `PostgreSQL`
+can be used too.

-- **Vector Database**: A vector database to store the records data and perform scalable vector similarity searches and basic document searches. We currently support `ElasticSearch` and `AWS OpenSearch` and they can be deployed as separate Docker images.
+- **Vector Database**: A vector database to store the records data and perform scalable vector similarity searches and
+basic document searches. We currently support `ElasticSearch` and `AWS OpenSearch` and they can be deployed as separate
+Docker images.

-- **Vue.js UI**: A web application to visualize and annotate your data, users, and teams. It is built with `Vue.js` and is directly deployed alongside the Argilla Server within our Argilla Docker image.
+- **Vue.js UI**: A web application to visualize and annotate your data, users, and teams. It is built with `Vue.js` and
+is directly deployed alongside the Argilla Server within our Argilla Docker image.

-The Argilla repository has a monorepo structure, which means that all the components live in the same repository: `argilla-io/argilla`. This repo is divided into the following folders:
+The Argilla repository has a monorepo structure, which means that all the components live in the same repository:
+`argilla-io/argilla`. This repo is divided into the following folders:

-- `argilla`: The python SDK project
-- `argilla-frontend`: The Vue.js UI project
-- `argilla-server`: The FastAPI server project
-- `docs`: The documentation project
+- [`argilla`](argilla): The python SDK project
+- [`argilla-server`](argilla-server): The FastAPI server project
+- [`argilla-frontend`](argilla-frontend): The Vue.js UI project
+- [`docs`](docs): The documentation project
+- [`examples`](examples): Example resources for deployments, scripts and notebooks

 For a proper installation, you will need to:

@@ -33,11 +48,14 @@ And, you can start to [make your contribution](#make-your-contribution)!

 ## Set up the Documentation Environment

-To kickstart your journey in contributing to Argilla, immersing yourself in the documentation is highly recommended. To do so, we recommend you create a virtual environment and follow the steps below. To build the documentation, a reduced set of dependencies is needed.
+To kickstart your journey in contributing to Argilla, immersing yourself in the documentation is highly recommended. To
+do so, we recommend you create a virtual environment and follow the steps below. To build the documentation, a reduced
+set of dependencies is needed.

 ### Clone the Argilla Repository

-First of all, you have to fork our repository and clone the fork to your computer. For more information, you can check our [guide](/community/contributing.md#work-with-a-fork).
+First of all, you have to fork our repository and clone the fork to your computer. For more information, you can check
+our [guide](/community/contributing.md#work-with-a-fork).

 ```sh
 git clone https://github.com/[your-github-username]/argilla.git
@@ -60,19 +78,26 @@ To build the documentation, make sure you set up your system by installing the r
 pip install -r docs/_source/requirements.txt
 ```

-During the installation, you may encounter the following error: Microsoft Visual C++ 14.0 or greater is required. To solve it easily, check this [link](https://learn.microsoft.com/en-us/answers/questions/136595/error-microsoft-visual-c-14-0-or-greater-is-requir).
+During the installation, you may encounter the following error: Microsoft Visual C++ 14.0 or greater is required. To
+solve it easily, check this [link](https://learn.microsoft.com/en-us/answers/questions/136595/error-microsoft-visual-c-14-0-or-greater-is-requir).

 ### Build the documentation

-To build the documentation, it is used [`sphinx`](https://www.sphinx-doc.org/en/master/),an open-source documentation generator, that is, it uses reStructuredText for writing documentation. Using Sphinx's command-line tool, it takes a collection of source files in plain text and generate them in HTML format. It also automatically creates a table of contents, index pages, and search features, enhancing navigation. To do so, the following files are required:
+To build the documentation, it is used [`sphinx`](https://www.sphinx-doc.org/en/master/),an open-source documentation generator, that is, it uses
+reStructuredText for writing documentation. Using Sphinx's command-line tool, it takes a collection of source files
+in plain text and generate them in HTML format. It also automatically creates a table of contents, index pages, and
+search features, enhancing navigation. To do so, the following files are required:

-- **index.rst**: This serves as the main entry point for our documentation, accessible at the root URL. It typically includes a table of contents (using the toc trees), connecting users to other documentation sections.
+- **index.rst**: This serves as the main entry point for our documentation, accessible at the root URL. It typically
+includes a table of contents (using the toc trees), connecting users to other documentation sections.
 - **conf.py**: This file enables customization of the documentation's output.
 - **Makefile**: A crucial component provided by Sphinx, serving as the primary tool for local development.
 - **Other .rst files**: These are intended for specific subsections of the documentation.
 - **Markdown files**: The source files with plain text.

-In our case, we rely on [`MyST-Parser`](https://myst-parser.readthedocs.io/en/latest/) to facilitate our work with Markdown. So, it's essential that when writing the documentation, we utilize [proper cross-references](https://docs.readthedocs.io/en/stable/guides/cross-referencing-with-sphinx.html) to connect various sections and documents. Below, you can find a typical illustration of commonly used cross-references:
+In our case, we rely on [`MyST-Parser`](https://myst-parser.readthedocs.io/en/latest/) to facilitate our work with Markdown. So, it's essential that when writing
+the documentation, we utilize [proper cross-references](https://docs.readthedocs.io/en/stable/guides/cross-referencing-with-sphinx.html) to connect various sections and documents. Below, you can
+find a typical illustration of commonly used cross-references:

 ```md
 # To reference a previous section
@@ -96,13 +121,17 @@ Reference [](my_target).
 - {doc}`Custom title `
 ```

-So, once the documentation is written or fixed, if the installation was smooth, then use `sphinx-autobuild` to continuously deploy the webpage using the following command:
+So, once the documentation is written or fixed, if the installation was smooth, then use `sphinx-autobuild` to
+continuously deploy the webpage using the following command:

 ```sh
 sphinx-autobuild docs/_source docs/_build/html
 ```

-This will create a _build/html folder that is served at [http://127.0.0.1:8000](http://127.0.0.1:8000). Also, it starts watching for changes in the docs/source directory. When a change is detected in docs/source, the documentation is rebuilt and any open browser windows are reloaded automatically. Make sure that all files are indexed correctly. KeyboardInterrupt (ctrl+c) will stop the server. Below is an example of the server output running and stopping:
+This will create a _build/html folder that is served at [http://127.0.0.1:8000](http://127.0.0.1:8000). Also, it starts watching for
+changes in the docs/source directory. When a change is detected in docs/source, the documentation is rebuilt and any
+open browser windows are reloaded automatically. Make sure that all files are indexed correctly. KeyboardInterrupt (ctrl+c)
+will stop the server. Below is an example of the server output running and stopping:

 ```sh
 The HTML pages are in docs\_build\html.
@@ -120,13 +149,16 @@ The HTML pages are in docs\_build\html.

 ## Set up the Development Environment

-To work and develop for the core product of Argilla, you need to have all of Argilla's subsystem correctly running. In this section, we'll show how to install the Argilla package, the databases and the server. The frontend is optional and only required for running the UI, but you can also find how to run it here.
+To work and develop for the core product of Argilla, you need to have all of Argilla's subsystem correctly running. In
+this section, we'll show how to install the Argilla package, the databases and the server. The frontend is optional
+and only required for running the UI, but you can also find how to run it here.

 ### Creating the Python Environment

 #### Clone the Argilla Repository

-To set up your system for Argilla development, you, first of all, have to [fork](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) our repository and [clone](https://github.com/argilla-io/argilla) the fork to your computer.
+To set up your system for Argilla development, you, first of all, have to [fork](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) our repository and [clone](https://github.com/argilla-io/argilla)
+the fork to your computer.

 ```sh
 git clone https://github.com/[your-github-username]/argilla.git
@@ -141,15 +173,20 @@ git remote add upstream https://github.com/argilla-io/argilla.git

 #### Install Dependencies

-You will need to install `argilla` and the extra dependencies that you prefer to be able to use Argilla in your Python client or Command Line Interface (CLI). There are two ways to install it and you can opt for one of them depending on your use case:
+You will need to install `argilla` and the extra dependencies that you prefer to be able to use Argilla in your Python
+client or Command Line Interface (CLI). There are two ways to install it and you can opt for one of them depending on
+your use case:

-- Install `argilla` with `pip`: Recommended for non-extensive, one-time contributions as it will only install the required packages.
+- Install `argilla` with `pip`: Recommended for non-extensive, one-time contributions as it will only install the
+required packages.

-- Install `argilla` with `conda`: Recommended for comprehensive, continuous contributions as it will create an all-inclusive environment for development.
+- Install `argilla` with `conda`: Recommended for comprehensive, continuous contributions as it will create an
+all-inclusive environment for development.

 ##### Install with `pip`

-If you choose to install Argilla via `pip`, you can do it easily on your terminal. Firstly, direct to the `argilla` folder in your terminal by:
+If you choose to install Argilla via `pip`, you can do it easily on your terminal. Firstly, direct to the `argilla`
+folder in your terminal by:

 ```sh
 cd argilla
@@ -162,51 +199,54 @@ python -m venv .env
 source .env/bin/activate
 ```

-Then, you just need to install Argilla with the command below. Note that we will install it in editable mode using the -e/--editable flag in the `pip` command to avoid having to re-install it on every code modification, but if you’re not planning to modify the code, you can just omit the -e/--editable flag.
+Then, you just need to install Argilla with the command below. Note that we will install it in editable mode using the
+-e/--editable flag in the `pip` command to avoid having to re-install it on every code modification, but if you’re not
+planning to modify the code, you can just omit the -e/--editable flag.

 ```sh
-cd argilla
 pip install -e .
 ```

 Or installing just the `server` extra:

 ```sh
-cd argilla
 pip install -e ".[server]"
 ```

-Or you can install all the extras, which are also required to run the tests via pytest to make sure that the implemented features or the bug fixes work as expected, and that the unit/integration tests are passing. If you encounter any package or dependency problems, please consider upgrading or downgrading the related packages to solve the problem.
+Or you can install all the extras, which are also required to run the tests via pytest to make sure that the implemented
+features or the bug fixes work as expected, and that the unit/integration tests are passing. If you encounter any package
+or dependency problems, please consider upgrading or downgrading the related packages to solve the problem.

 ```sh
-cd argilla
 pip install -e ".[server,listeners,postgresql,integrations,tests]"
 ```

 ##### Install with `conda`

-If you want to go with `conda` to install Argilla, firstly make sure that you have the latest version of conda on your system. You can go to the [anaconda page](https://conda.io/projects/conda/en/latest/user-guide/install/index.html#regular-installation) and follow the tutorial there to make a clean install of `conda` on your system.
+If you want to go with `conda` to install Argilla, firstly make sure that you have the latest version of conda on your
+system. You can go to the [anaconda page](https://conda.io/projects/conda/en/latest/user-guide/install/index.html#regular-installation) and follow the tutorial there to make a clean install of `conda` on
+your system.

-Make sure that you are in the argilla folder.
-
-```sh
-cd argilla
-```
-
-Then, you can go ahead and create a new conda development environment, and then, activate it:
+Make sure that you are in the argilla folder. Then, you can go ahead and create a new conda development environment, and
+then, activate it:

 ```sh
 conda env create -f environment_dev.yml
 conda activate argilla
 ```

-In the new Conda environment, Argilla will already be installed in editable mode with all the server dependencies. But if you’re willing to install any other dependency you can do so via `pip` to install your own, or just see the available extras besides the `server` extras, which are: `listeners`, `postgresql`, and `tests`; all those installable as `pip install -e ".[]"`.
+In the new Conda environment, Argilla will already be installed in editable mode with all the server dependencies. But
+if you’re willing to install any other dependency you can do so via `pip` to install your own, or just see the available
+extras besides the `server` extras, which are: `listeners`, `postgresql`, and `tests`; all those installable as
+`pip install -e ".[]"`.

-Now, the Argilla package is set up on your system and you need to make further installations for a thorough development setup.
+Now, the Argilla package is set up on your system and you need to make further installations for a thorough development
+setup.

 #### Install Code Formatting Tools

-To keep a consistent code format, we use [pre-commit](https://pre-commit.com/) hooks. So, you first need to install `pre-commit` if not installed already, via pip as follows:
+To keep a consistent code format, we use [pre-commit](https://pre-commit.com/) hooks. So, you first need to install `pre-commit` if not
+installed already, via pip as follows:

 ```sh
 pip install pre-commit
@@ -220,13 +260,19 @@ pre-commit install

 ### Set up the Databases

-Argilla is built upon two databases: vector database and relational database. The vector database stores all the record data and is the component that performs scalable vector similarity searches as well as basic vector searches. On the other hand, the relational database stores the metadata of the records and annotations besides user and workspace information.
+Argilla is built upon two databases: vector database and relational database. The vector database stores all the record
+data and is the component that performs scalable vector similarity searches as well as basic vector searches. On the
+other hand, the relational database stores the metadata of the records and annotations besides user and workspace
+information.

 #### Vector Database

-Argilla supports ElasticSearch and OpenSearch as its main search engine for the vector database. One of the two is required to correctly run Argilla in your development environment.
+Argilla supports ElasticSearch and OpenSearch as its main search engine for the vector database. One of the two is
+required to correctly run Argilla in your development environment.

-To install Elasticsearch or Opensearch, and to work with Argilla on your server later, you first need to install Docker on your system. You can find the Docker installation guides for [Windows](https://docs.docker.com/desktop/install/windows-install/), [macOS](https://docs.docker.com/desktop/install/mac-install/) and [Linux](https://docs.docker.com/desktop/install/linux-install/) on Docker website.
+To install Elasticsearch or Opensearch, and to work with Argilla on your server later, you first need to install Docker
+on your system. You can find the Docker installation guides for [Windows](https://docs.docker.com/desktop/install/windows-install/), [macOS](https://docs.docker.com/desktop/install/mac-install/) and [Linux](https://docs.docker.com/desktop/install/linux-install/) on
+Docker website.

 To install ElasticSearch or OpenSearch, you can refer to the [Setup and Installation](/getting_started/installation/deployments/docker.md) guide.

@@ -235,7 +281,8 @@ Argilla supports ElasticSearch versions >=8.5, and OpenSearch versions >=2.4.
 :::

 :::{note}
-For vector search in OpenSearch, the filtering applied is using a `post_filter` step, since there is a bug that makes queries fail using filtering + knn from Argilla.
+For vector search in OpenSearch, the filtering applied is using a `post_filter` step, since there is a bug that makes
+queries fail using filtering + knn from Argilla.
 See https://github.com/opensearch-project/k-NN/issues/1286

 This may result in unexpected results when combining filtering with vector search with this engine.
@@ -243,13 +290,17 @@ This may result in unexpected results when combining filtering with vector searc

 #### Relational Database and Migration

-Argilla will use SQLite as the default built-in option to store information about users, workspaces, etc. for the relational database. No additional configuration is required to start using SQLite.
+Argilla will use SQLite as the default built-in option to store information about users, workspaces, etc. for the
+relational database. No additional configuration is required to start using SQLite.

-By default, the database file will be created at `~/.argilla/argilla.db`, this can be configured by setting different values for `ARGILLA_DATABASE_URL` and `ARGILLA_HOME_PATH` environment variables.
+By default, the database file will be created at `~/.argilla/argilla.db`, this can be configured by setting different
+values for `ARGILLA_DATABASE_URL` and `ARGILLA_HOME_PATH` environment variables.

 ##### Run Database Migration

-Starting from Argilla 1.16.0, the data of the FeedbackDataset along with the user and workspace information are stored in an SQL database (SQLite or PostgreSQL). With each Argilla release, you may need to update the database schema to the newer version. Here, you can find how to do this database migration.
+Starting from Argilla 1.16.0, the data of the FeedbackDataset along with the user and workspace information are stored +in an SQL database (SQLite or PostgreSQL). With each Argilla release, you may need to update the database schema to +the newer version. Here, you can find how to do this database migration. You can run database migrations by executing the following command: @@ -257,11 +308,14 @@ You can run database migrations by executing the following command: argilla server database migrate ``` -The default SQLite database will be created at `~/.argilla/argilla.db`. This can be changed by setting different values for `ARGILLA_DATABASE_URL` and `ARGILLA_HOME_PATH` environment variables. +The default SQLite database will be created at `~/.argilla/argilla.db`. This can be changed by setting different values +for `ARGILLA_DATABASE_URL` and `ARGILLA_HOME_PATH` environment variables. ##### Create the Default User -To run the Argilla database and server on your system, you should at least create the default user. Alternatively, you may skip a default user and directly create user(s) whose credentials you will set up. You can refer to the [user management](../getting_started/installation/configurations/user_management.md#create-a-user) page for detailed information. +To run the Argilla database and server on your system, you should at least create the default user. Alternatively, you +may skip a default user and directly create user(s) whose credentials you will set up. You can refer to the +[user management](../getting_started/installation/configurations/user_management.md#create-a-user) page for detailed information. To create a default user, you can run the following command: @@ -271,7 +325,9 @@ argilla server database users create_default ##### Recreate the Database -Occasionally, it may be necessary to recreate the database from scratch to ensure a clean state in your development environment. 
For instance, to run the Argilla test suite or troubleshoot issues that could be related to database inconsistencies. +Occasionally, it may be necessary to recreate the database from scratch to ensure a clean state in your development +environment. For instance, to run the Argilla test suite or troubleshoot issues that could be related to database +inconsistencies. First, you need to delete the Argilla database with the following command: @@ -279,66 +335,65 @@ First, you need to delete the Argilla database with the following command: rm ~/.argilla/argilla.db ``` -After deleting the database, you will need to run the [database migration](#run-database-migration) task. By following these steps, you’ll have a fresh and clean database to work with. +After deleting the database, you will need to run the [database migration](#run-database-migration) task. By following these steps, you’ll +have a fresh and clean database to work with. ### Set up Argilla Server -If you want to work on the server of Argilla, please visit the `argilla-server` [README.md](./argilla-server/README.md) file to see how to set up the server and run it on your local machine. +If you want to work on the server of Argilla, please visit the `argilla-server` [README.md](argilla-server/README.md) +file to see how to set up the server and run it on your local machine. ### Set up Argilla Frontend -If you want to work on the frontend of Argilla, please visit the `argilla-frontend` [README.md](./argilla-frontend/README.md) file to see how to set up the frontend and run it on your local machine. +If you want to work on the frontend of Argilla, please visit the `argilla-frontend` [README.md](argilla-frontend/README.md) +file to see how to set up the frontend and run it on your local machine. ## Make Your Contribution -Now that everything is up and running, you can start to develop and contribute to Argilla! 
You can refer to our [contributor guide](/community/contributing.md) to have an understanding of how you can structure your contribution and upload it to the repository. +Now that everything is up and running, you can start to develop and contribute to Argilla! You can refer to +our [contributor guide](/community/contributing.md) to have an understanding of how you can structure your contribution and upload it +to the repository. ### Run Tests #### Running Tests for the Argilla Python SDK -Running tests at the end of every development cycle is indispensable to make sure that there are no breaking changes. In your Argilla environment, you can run all the tests as follows: +Running tests at the end of every development cycle is indispensable to make sure that there are no breaking changes. In +your Argilla environment, you can run all the tests as follows (Under the argilla project folder) ```sh -cd argilla/ pytest tests ``` You can also run only the unit tests by providing the proper path: ```sh -cd argilla/ pytest tests/unit ``` For running more heavy integration tests you can just run pytest with the `tests/integration` folder: ```sh -cd argilla/ pytest tests/integration ``` #### Running tests for the Argilla Server -To run the tests for the Argilla Server, you can use the following command: +To run the tests for the Argilla Server, you can use the following command (Under the argilla project folder): ```sh -cd argilla-server/ pdm test test/unit ``` You can also set up a PostgreSQL database instead of the default sqlite backend: ```sh -cd argilla-server/ ARGILLA_DATABASE_URL=postgresql://postgres:postgres@localhost:5432 pdm test tests/unit ``` #### Running tests for the Argilla Frontend -To run the tests for the Argilla Frontend, you can use the following command: +To run the tests for the Argilla Frontend, you can use the following command (Under the argilla project folder): ```sh -cd argilla-frontend/ npm run test ``` - From 
bb4f0bdfc004f7ea6d5cc3fd7b85ac17c0dfcdc5 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Tue, 14 May 2024 13:47:22 +0000 Subject: [PATCH 04/11] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- docs/_source/community/developer_docs.md | 130 +++++++++++------------ 1 file changed, 65 insertions(+), 65 deletions(-) diff --git a/docs/_source/community/developer_docs.md b/docs/_source/community/developer_docs.md index 9e6d74beed..9b7a5167c6 100644 --- a/docs/_source/community/developer_docs.md +++ b/docs/_source/community/developer_docs.md @@ -1,33 +1,33 @@ # Developer Documentation -Being a developer in Argilla means that you are a part of the Argilla community, and you are contributing to the -development of Argilla. This page will guide you through the steps that you need to take to set up your development +Being a developer in Argilla means that you are a part of the Argilla community, and you are contributing to the +development of Argilla. This page will guide you through the steps that you need to take to set up your development environment and start contributing to Argilla. Argilla is built upon different core components: -- **Documentation**: The documentation for Argilla serves as an invaluable resource, providing a comprehensive and -in-depth guide for users seeking to explore, understand, and effectively harness the core components of the Argilla +- **Documentation**: The documentation for Argilla serves as an invaluable resource, providing a comprehensive and +in-depth guide for users seeking to explore, understand, and effectively harness the core components of the Argilla ecosystem. -- **Python SDK**: A Python SDK which is installable with `pip install argilla`, to interact with the Argilla Server and +- **Python SDK**: A Python SDK which is installable with `pip install argilla`, to interact with the Argilla Server and the Argilla UI. 
It provides an API to manage the data, configuration, and annotation workflows. -- **FastAPI Server**: The core of Argilla is a Python `FastAPI server` that manages the data, by pre-processing it and -storing it in the vector database. Also, it stores application information in the relational database. It provides a -REST API to interact with the data from the Python SDK and the Argilla UI. It also provides a web interface to visualize +- **FastAPI Server**: The core of Argilla is a Python `FastAPI server` that manages the data, by pre-processing it and +storing it in the vector database. Also, it stores application information in the relational database. It provides a +REST API to interact with the data from the Python SDK and the Argilla UI. It also provides a web interface to visualize the data. -- **Relational Database**: A relational database to store the metadata of the records and the annotations. `SQLite` is -used as the default built-in option and is deployed separately with the Argilla Server but a separate `PostgreSQL` +- **Relational Database**: A relational database to store the metadata of the records and the annotations. `SQLite` is +used as the default built-in option and is deployed separately with the Argilla Server but a separate `PostgreSQL` can be used too. - **Vector Database**: A vector database to store the records data and perform scalable vector similarity searches and -basic document searches. We currently support `ElasticSearch` and `AWS OpenSearch` and they can be deployed as separate +basic document searches. We currently support `ElasticSearch` and `AWS OpenSearch` and they can be deployed as separate Docker images. - **Vue.js UI**: A web application to visualize and annotate your data, users, and teams. It is built with `Vue.js` and is directly deployed alongside the Argilla Server within our Argilla Docker image. 
-The Argilla repository has a monorepo structure, which means that all the components live in the same repository: +The Argilla repository has a monorepo structure, which means that all the components live in the same repository: `argilla-io/argilla`. This repo is divided into the following folders: - [`argilla`](argilla): The python SDK project @@ -48,13 +48,13 @@ And, you can start to [make your contribution](#make-your-contribution)! ## Set up the Documentation Environment -To kickstart your journey in contributing to Argilla, immersing yourself in the documentation is highly recommended. To +To kickstart your journey in contributing to Argilla, immersing yourself in the documentation is highly recommended. To do so, we recommend you create a virtual environment and follow the steps below. To build the documentation, a reduced set of dependencies is needed. ### Clone the Argilla Repository -First of all, you have to fork our repository and clone the fork to your computer. For more information, you can check +First of all, you have to fork our repository and clone the fork to your computer. For more information, you can check our [guide](/community/contributing.md#work-with-a-fork). ```sh @@ -78,25 +78,25 @@ To build the documentation, make sure you set up your system by installing the r pip install -r docs/_source/requirements.txt ``` -During the installation, you may encounter the following error: Microsoft Visual C++ 14.0 or greater is required. To +During the installation, you may encounter the following error: Microsoft Visual C++ 14.0 or greater is required. To solve it easily, check this [link](https://learn.microsoft.com/en-us/answers/questions/136595/error-microsoft-visual-c-14-0-or-greater-is-requir). ### Build the documentation -To build the documentation, it is used [`sphinx`](https://www.sphinx-doc.org/en/master/),an open-source documentation generator, that is, it uses -reStructuredText for writing documentation. 
Using Sphinx's command-line tool, it takes a collection of source files -in plain text and generate them in HTML format. It also automatically creates a table of contents, index pages, and +To build the documentation, it is used [`sphinx`](https://www.sphinx-doc.org/en/master/),an open-source documentation generator, that is, it uses +reStructuredText for writing documentation. Using Sphinx's command-line tool, it takes a collection of source files +in plain text and generate them in HTML format. It also automatically creates a table of contents, index pages, and search features, enhancing navigation. To do so, the following files are required: -- **index.rst**: This serves as the main entry point for our documentation, accessible at the root URL. It typically +- **index.rst**: This serves as the main entry point for our documentation, accessible at the root URL. It typically includes a table of contents (using the toc trees), connecting users to other documentation sections. - **conf.py**: This file enables customization of the documentation's output. - **Makefile**: A crucial component provided by Sphinx, serving as the primary tool for local development. - **Other .rst files**: These are intended for specific subsections of the documentation. - **Markdown files**: The source files with plain text. -In our case, we rely on [`MyST-Parser`](https://myst-parser.readthedocs.io/en/latest/) to facilitate our work with Markdown. So, it's essential that when writing -the documentation, we utilize [proper cross-references](https://docs.readthedocs.io/en/stable/guides/cross-referencing-with-sphinx.html) to connect various sections and documents. Below, you can +In our case, we rely on [`MyST-Parser`](https://myst-parser.readthedocs.io/en/latest/) to facilitate our work with Markdown. 
So, it's essential that when writing +the documentation, we utilize [proper cross-references](https://docs.readthedocs.io/en/stable/guides/cross-referencing-with-sphinx.html) to connect various sections and documents. Below, you can find a typical illustration of commonly used cross-references: ```md @@ -121,16 +121,16 @@ Reference [](my_target). - {doc}`Custom title ` ``` -So, once the documentation is written or fixed, if the installation was smooth, then use `sphinx-autobuild` to +So, once the documentation is written or fixed, if the installation was smooth, then use `sphinx-autobuild` to continuously deploy the webpage using the following command: ```sh sphinx-autobuild docs/_source docs/_build/html ``` -This will create a _build/html folder that is served at [http://127.0.0.1:8000](http://127.0.0.1:8000). Also, it starts watching for -changes in the docs/source directory. When a change is detected in docs/source, the documentation is rebuilt and any -open browser windows are reloaded automatically. Make sure that all files are indexed correctly. KeyboardInterrupt (ctrl+c) +This will create a _build/html folder that is served at [http://127.0.0.1:8000](http://127.0.0.1:8000). Also, it starts watching for +changes in the docs/source directory. When a change is detected in docs/source, the documentation is rebuilt and any +open browser windows are reloaded automatically. Make sure that all files are indexed correctly. KeyboardInterrupt (ctrl+c) will stop the server. Below is an example of the server output running and stopping: ```sh @@ -149,15 +149,15 @@ The HTML pages are in docs\_build\html. ## Set up the Development Environment -To work and develop for the core product of Argilla, you need to have all of Argilla's subsystem correctly running. In -this section, we'll show how to install the Argilla package, the databases and the server. 
The frontend is optional +To work and develop for the core product of Argilla, you need to have all of Argilla's subsystem correctly running. In +this section, we'll show how to install the Argilla package, the databases and the server. The frontend is optional and only required for running the UI, but you can also find how to run it here. ### Creating the Python Environment #### Clone the Argilla Repository -To set up your system for Argilla development, you, first of all, have to [fork](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) our repository and [clone](https://github.com/argilla-io/argilla) +To set up your system for Argilla development, you, first of all, have to [fork](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) our repository and [clone](https://github.com/argilla-io/argilla) the fork to your computer. ```sh @@ -173,19 +173,19 @@ git remote add upstream https://github.com/argilla-io/argilla.git #### Install Dependencies -You will need to install `argilla` and the extra dependencies that you prefer to be able to use Argilla in your Python -client or Command Line Interface (CLI). There are two ways to install it and you can opt for one of them depending on +You will need to install `argilla` and the extra dependencies that you prefer to be able to use Argilla in your Python +client or Command Line Interface (CLI). There are two ways to install it and you can opt for one of them depending on your use case: -- Install `argilla` with `pip`: Recommended for non-extensive, one-time contributions as it will only install the +- Install `argilla` with `pip`: Recommended for non-extensive, one-time contributions as it will only install the required packages. 
-- Install `argilla` with `conda`: Recommended for comprehensive, continuous contributions as it will create an +- Install `argilla` with `conda`: Recommended for comprehensive, continuous contributions as it will create an all-inclusive environment for development. ##### Install with `pip` -If you choose to install Argilla via `pip`, you can do it easily on your terminal. Firstly, direct to the `argilla` +If you choose to install Argilla via `pip`, you can do it easily on your terminal. Firstly, direct to the `argilla` folder in your terminal by: ```sh @@ -199,8 +199,8 @@ python -m venv .env source .env/bin/activate ``` -Then, you just need to install Argilla with the command below. Note that we will install it in editable mode using the --e/--editable flag in the `pip` command to avoid having to re-install it on every code modification, but if you’re not +Then, you just need to install Argilla with the command below. Note that we will install it in editable mode using the +-e/--editable flag in the `pip` command to avoid having to re-install it on every code modification, but if you’re not planning to modify the code, you can just omit the -e/--editable flag. ```sh @@ -213,8 +213,8 @@ Or installing just the `server` extra: pip install -e ".[server]" ``` -Or you can install all the extras, which are also required to run the tests via pytest to make sure that the implemented -features or the bug fixes work as expected, and that the unit/integration tests are passing. If you encounter any package +Or you can install all the extras, which are also required to run the tests via pytest to make sure that the implemented +features or the bug fixes work as expected, and that the unit/integration tests are passing. If you encounter any package or dependency problems, please consider upgrading or downgrading the related packages to solve the problem. 
```sh @@ -223,8 +223,8 @@ pip install -e ".[server,listeners,postgresql,integrations,tests]" ##### Install with `conda` -If you want to go with `conda` to install Argilla, firstly make sure that you have the latest version of conda on your -system. You can go to the [anaconda page](https://conda.io/projects/conda/en/latest/user-guide/install/index.html#regular-installation) and follow the tutorial there to make a clean install of `conda` on +If you want to go with `conda` to install Argilla, firstly make sure that you have the latest version of conda on your +system. You can go to the [anaconda page](https://conda.io/projects/conda/en/latest/user-guide/install/index.html#regular-installation) and follow the tutorial there to make a clean install of `conda` on your system. Make sure that you are in the argilla folder. Then, you can go ahead and create a new conda development environment, and @@ -235,17 +235,17 @@ conda env create -f environment_dev.yml conda activate argilla ``` -In the new Conda environment, Argilla will already be installed in editable mode with all the server dependencies. But +In the new Conda environment, Argilla will already be installed in editable mode with all the server dependencies. But if you’re willing to install any other dependency you can do so via `pip` to install your own, or just see the available -extras besides the `server` extras, which are: `listeners`, `postgresql`, and `tests`; all those installable as +extras besides the `server` extras, which are: `listeners`, `postgresql`, and `tests`; all those installable as `pip install -e ".[]"`. -Now, the Argilla package is set up on your system and you need to make further installations for a thorough development +Now, the Argilla package is set up on your system and you need to make further installations for a thorough development setup. #### Install Code Formatting Tools -To keep a consistent code format, we use [pre-commit](https://pre-commit.com/) hooks. 
So, you first need to install `pre-commit` if not +To keep a consistent code format, we use [pre-commit](https://pre-commit.com/) hooks. So, you first need to install `pre-commit` if not installed already, via pip as follows: ```sh @@ -260,18 +260,18 @@ pre-commit install ### Set up the Databases -Argilla is built upon two databases: vector database and relational database. The vector database stores all the record -data and is the component that performs scalable vector similarity searches as well as basic vector searches. On the -other hand, the relational database stores the metadata of the records and annotations besides user and workspace +Argilla is built upon two databases: vector database and relational database. The vector database stores all the record +data and is the component that performs scalable vector similarity searches as well as basic vector searches. On the +other hand, the relational database stores the metadata of the records and annotations besides user and workspace information. #### Vector Database -Argilla supports ElasticSearch and OpenSearch as its main search engine for the vector database. One of the two is +Argilla supports ElasticSearch and OpenSearch as its main search engine for the vector database. One of the two is required to correctly run Argilla in your development environment. -To install Elasticsearch or Opensearch, and to work with Argilla on your server later, you first need to install Docker -on your system. You can find the Docker installation guides for [Windows](https://docs.docker.com/desktop/install/windows-install/), [macOS](https://docs.docker.com/desktop/install/mac-install/) and [Linux](https://docs.docker.com/desktop/install/linux-install/) on +To install Elasticsearch or Opensearch, and to work with Argilla on your server later, you first need to install Docker +on your system. 
You can find the Docker installation guides for [Windows](https://docs.docker.com/desktop/install/windows-install/), [macOS](https://docs.docker.com/desktop/install/mac-install/) and [Linux](https://docs.docker.com/desktop/install/linux-install/) on Docker website. To install ElasticSearch or OpenSearch, you can refer to the [Setup and Installation](/getting_started/installation/deployments/docker.md) guide. @@ -281,7 +281,7 @@ Argilla supports ElasticSearch versions >=8.5, and OpenSearch versions >=2.4. ::: :::{note} -For vector search in OpenSearch, the filtering applied is using a `post_filter` step, since there is a bug that makes +For vector search in OpenSearch, the filtering applied is using a `post_filter` step, since there is a bug that makes queries fail using filtering + knn from Argilla. See https://github.com/opensearch-project/k-NN/issues/1286 @@ -290,16 +290,16 @@ This may result in unexpected results when combining filtering with vector searc #### Relational Database and Migration -Argilla will use SQLite as the default built-in option to store information about users, workspaces, etc. for the +Argilla will use SQLite as the default built-in option to store information about users, workspaces, etc. for the relational database. No additional configuration is required to start using SQLite. -By default, the database file will be created at `~/.argilla/argilla.db`, this can be configured by setting different +By default, the database file will be created at `~/.argilla/argilla.db`, this can be configured by setting different values for `ARGILLA_DATABASE_URL` and `ARGILLA_HOME_PATH` environment variables. ##### Run Database Migration -Starting from Argilla 1.16.0, the data of the FeedbackDataset along with the user and workspace information are stored -in an SQL database (SQLite or PostgreSQL). 
With each Argilla release, you may need to update the database schema to +Starting from Argilla 1.16.0, the data of the FeedbackDataset along with the user and workspace information are stored +in an SQL database (SQLite or PostgreSQL). With each Argilla release, you may need to update the database schema to the newer version. Here, you can find how to do this database migration. You can run database migrations by executing the following command: @@ -308,13 +308,13 @@ You can run database migrations by executing the following command: argilla server database migrate ``` -The default SQLite database will be created at `~/.argilla/argilla.db`. This can be changed by setting different values +The default SQLite database will be created at `~/.argilla/argilla.db`. This can be changed by setting different values for `ARGILLA_DATABASE_URL` and `ARGILLA_HOME_PATH` environment variables. ##### Create the Default User -To run the Argilla database and server on your system, you should at least create the default user. Alternatively, you -may skip a default user and directly create user(s) whose credentials you will set up. You can refer to the +To run the Argilla database and server on your system, you should at least create the default user. Alternatively, you +may skip a default user and directly create user(s) whose credentials you will set up. You can refer to the [user management](../getting_started/installation/configurations/user_management.md#create-a-user) page for detailed information. To create a default user, you can run the following command: @@ -325,8 +325,8 @@ argilla server database users create_default ##### Recreate the Database -Occasionally, it may be necessary to recreate the database from scratch to ensure a clean state in your development -environment. 
For instance, to run the Argilla test suite or troubleshoot issues that could be related to database +Occasionally, it may be necessary to recreate the database from scratch to ensure a clean state in your development +environment. For instance, to run the Argilla test suite or troubleshoot issues that could be related to database inconsistencies. First, you need to delete the Argilla database with the following command: @@ -335,29 +335,29 @@ First, you need to delete the Argilla database with the following command: rm ~/.argilla/argilla.db ``` -After deleting the database, you will need to run the [database migration](#run-database-migration) task. By following these steps, you’ll +After deleting the database, you will need to run the [database migration](#run-database-migration) task. By following these steps, you’ll have a fresh and clean database to work with. ### Set up Argilla Server -If you want to work on the server of Argilla, please visit the `argilla-server` [README.md](argilla-server/README.md) +If you want to work on the server of Argilla, please visit the `argilla-server` [README.md](argilla-server/README.md) file to see how to set up the server and run it on your local machine. ### Set up Argilla Frontend -If you want to work on the frontend of Argilla, please visit the `argilla-frontend` [README.md](argilla-frontend/README.md) +If you want to work on the frontend of Argilla, please visit the `argilla-frontend` [README.md](argilla-frontend/README.md) file to see how to set up the frontend and run it on your local machine. ## Make Your Contribution -Now that everything is up and running, you can start to develop and contribute to Argilla! You can refer to -our [contributor guide](/community/contributing.md) to have an understanding of how you can structure your contribution and upload it +Now that everything is up and running, you can start to develop and contribute to Argilla! 
You can refer to
+our [contributor guide](/community/contributing.md) to have an understanding of how you can structure your contribution and upload it
 to the repository.

 ### Run Tests

 #### Running Tests for the Argilla Python SDK

-Running tests at the end of every development cycle is indispensable to make sure that there are no breaking changes. In
+Running tests at the end of every development cycle is indispensable to make sure that there are no breaking changes. In
 your Argilla environment, you can run all the tests as follows (Under the argilla project folder)

 ```sh

From 8d2d42ac9dc84f02ec76970b4291209c772d6994 Mon Sep 17 00:00:00 2001
From: Francisco Aranda
Date: Tue, 14 May 2024 16:06:50 +0200
Subject: [PATCH 05/11] chore: fix links

---
 docs/_source/community/developer_docs.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/_source/community/developer_docs.md b/docs/_source/community/developer_docs.md
index 9b7a5167c6..a29b7574b3 100644
--- a/docs/_source/community/developer_docs.md
+++ b/docs/_source/community/developer_docs.md
@@ -30,11 +30,11 @@ is directly deployed alongside the Argilla Server within our Argilla Docker imag
 The Argilla repository has a monorepo structure, which means that all the components live in the same repository:
 `argilla-io/argilla`. This repo is divided into the following folders:

-- [`argilla`](argilla): The python SDK project
-- [`argilla-server`](argilla-server): The FastAPI server project
-- [`argilla-frontend`](argilla-frontend): The Vue.js UI project
-- [`docs`](docs): The documentation project
-- [`examples`](examples): Example resources for deployments, scripts and notebooks
+- [`argilla`](/argilla): The python SDK project
+- [`argilla-server`](/argilla-server): The FastAPI server project
+- [`argilla-frontend`](/argilla-frontend): The Vue.js UI project
+- [`docs`](/docs): The documentation project
+- [`examples`](/examples): Example resources for deployments, scripts and notebooks

 For a proper installation, you will need to:

@@ -340,12 +340,12 @@ have a fresh and clean database to work with.

 ### Set up Argilla Server

-If you want to work on the server of Argilla, please visit the `argilla-server` [README.md](argilla-server/README.md)
+If you want to work on the server of Argilla, please visit the `argilla-server` [README.md](/argilla-server/README.md)
 file to see how to set up the server and run it on your local machine.

 ### Set up Argilla Frontend

-If you want to work on the frontend of Argilla, please visit the `argilla-frontend` [README.md](argilla-frontend/README.md)
+If you want to work on the frontend of Argilla, please visit the `argilla-frontend` [README.md](/argilla-frontend/README.md)
 file to see how to set up the frontend and run it on your local machine.

 ## Make Your Contribution

From c4356935751f0f4c8466c42558cba2fef50352df Mon Sep 17 00:00:00 2001
From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date: Tue, 14 May 2024 14:07:21 +0000
Subject: [PATCH 06/11] [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
---
 docs/_source/community/developer_docs.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/_source/community/developer_docs.md b/docs/_source/community/developer_docs.md
index a29b7574b3..888249564e 100644
--- a/docs/_source/community/developer_docs.md
+++ b/docs/_source/community/developer_docs.md
@@ -340,12 +340,12 @@ have a fresh and clean database to work with.

 ### Set up Argilla Server

-If you want to work on the server of Argilla, please visit the `argilla-server` [README.md](/argilla-server/README.md)
+If you want to work on the server of Argilla, please visit the `argilla-server` [README.md](/argilla-server/README.md)
 file to see how to set up the server and run it on your local machine.

 ### Set up Argilla Frontend

-If you want to work on the frontend of Argilla, please visit the `argilla-frontend` [README.md](/argilla-frontend/README.md)
+If you want to work on the frontend of Argilla, please visit the `argilla-frontend` [README.md](/argilla-frontend/README.md)
 file to see how to set up the frontend and run it on your local machine.
 ## Make Your Contribution

From c9f0d878f38d8df38c9d5f123ec20b95eac69f50 Mon Sep 17 00:00:00 2001
From: Francisco Aranda
Date: Tue, 14 May 2024 16:19:00 +0200
Subject: [PATCH 07/11] chore: Remove unused files

---
 argilla-server/scripts/build_distribution.sh | 8 --------
 argilla-server/scripts/build_frontend.sh | 7 -------
 2 files changed, 15 deletions(-)
 delete mode 100755 argilla-server/scripts/build_distribution.sh
 delete mode 100755 argilla-server/scripts/build_frontend.sh

diff --git a/argilla-server/scripts/build_distribution.sh b/argilla-server/scripts/build_distribution.sh
deleted file mode 100755
index 810ded14ca..0000000000
--- a/argilla-server/scripts/build_distribution.sh
+++ /dev/null
@@ -1,8 +0,0 @@
-#!/usr/bin/env bash
-set -e
-
-BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
-
-$BASEDIR/build_frontend.sh
-
-rm -rf dist && pdm build
diff --git a/argilla-server/scripts/build_frontend.sh b/argilla-server/scripts/build_frontend.sh
deleted file mode 100755
index 4c517672c9..0000000000
--- a/argilla-server/scripts/build_frontend.sh
+++ /dev/null
@@ -1,7 +0,0 @@
-#!/usr/bin/env bash
-
-cd argilla/frontend \
-&& npm install \
-&& npm run-script lint \
-&& npm run-script test \
-&& BASE_URL=@@baseUrl@@ DIST_FOLDER=../../src/argilla_server/static npm run-script build \

From 52d7d9ac39feadfaee7728981ae6fbae3b438dcc Mon Sep 17 00:00:00 2001
From: Francisco Aranda
Date: Tue, 14 May 2024 16:19:13 +0200
Subject: [PATCH 08/11] chore: Update readme files

---
 argilla-frontend/README.md | 39 ++++++++++++++++++----
 argilla-server/README.md | 67 +++++++++-----------------------------
 2 files changed, 47 insertions(+), 59 deletions(-)

diff --git a/argilla-frontend/README.md b/argilla-frontend/README.md
index cf8039f6b6..f95b28006a 100644
--- a/argilla-frontend/README.md
+++ b/argilla-frontend/README.md
@@ -53,7 +53,9 @@ https://github.com/argilla-io/argilla/assets/1107111/49e28d64-9799-4cac-be49-19d

 ## 🚀 Quickstart

-Argilla is an open-source data
curation platform for LLMs. Using Argilla, everyone can build robust language models through faster data curation using both human and machine feedback. We provide support for each step in the MLOps cycle, from data labeling to model monitoring.
+Argilla is an open-source data curation platform for LLMs. Using Argilla, everyone can build robust language models
+through faster data curation using both human and machine feedback. We provide support for each step in the MLOps cycle,
+from data labeling to model monitoring.

 There are different options to get started:

@@ -65,6 +67,12 @@ There are different options to get started:

 ## 🖥️ FRONTEND

+-Before running Argilla frontend server, we need to install Node version 18:
+
+```bash
+-brew install node@18
+```
+

💣 Install dependencies

 ```bash
@@ -89,17 +97,33 @@ npm run generate

 ## 📏 Principles

-- **Open**: Argilla is free, open-source, and 100% compatible with major NLP libraries (Hugging Face transformers, spaCy, Stanford Stanza, Flair, etc.). In fact, you can **use and combine your preferred libraries** without implementing any specific interface.
+- **Open**: Argilla is free, open-source, and 100% compatible with major NLP libraries (Hugging Face transformers,
+ spaCy, Stanford Stanza, Flair, etc.). In fact, you can **use and combine your preferred libraries** without
+ implementing any specific interface.

-- **End-to-end**: Most annotation tools treat data collection as a one-off activity at the beginning of each project. In real-world projects, data collection is a key activity of the iterative process of ML model development. Once a model goes into production, you want to monitor and analyze its predictions and collect more data to improve your model over time. Argilla is designed to close this gap, enabling you to **iterate as much as you need**.
+- **End-to-end**: Most annotation tools treat data collection as a one-off activity at the beginning of each project. In
+ real-world projects, data collection is a key activity of the iterative process of ML model development. Once a model
+ goes into production, you want to monitor and analyze its predictions and collect more data to improve your model over
+ time. Argilla is designed to close this gap, enabling you to **iterate as much as you need**.

-- **User and Developer Experience**: The key to sustainable NLP solutions are to make it easier for everyone to contribute to projects. _Domain experts_ should feel comfortable interpreting and annotating data. _Data scientists_ should feel free to experiment and iterate. _Engineers_ should feel in control of data pipelines. Argilla optimizes the experience for these core users to **make your teams more productive**.
+- **User and Developer Experience**: The key to sustainable NLP solutions are to make it easier for everyone to
+ contribute to projects. _Domain experts_ should feel comfortable interpreting and annotating data. _Data scientists_
+ should feel free to experiment and iterate. _Engineers_ should feel in control of data pipelines. Argilla optimizes
+ the experience for these core users to **make your teams more productive**.

-- **Beyond hand-labeling**: Classical hand-labeling workflows are costly and inefficient, but having humans in the loop is essential. Easily combine hand-labeling with active learning, bulk-labeling, zero-shot models, and weak supervision in **novel** data annotation workflows\*\*.
+- **Beyond hand-labeling**: Classical hand-labeling workflows are costly and inefficient, but having humans in the loop
+ is essential. Easily combine hand-labeling with active learning, bulk-labeling, zero-shot models, and weak supervision
+ in **novel** data annotation workflows\*\*.

 ## 🫱🏾‍🫲🏼 Contribute

-We love contributors and have launched a [collaboration with JustDiggit](https://argilla.io/blog/introducing-argilla-community-growers) to hand out our very own bunds and help the re-greening of sub-Saharan Africa. To help our community with the creation of contributions, we have created our [developer](https://docs.argilla.io/en/latest/community/developer_docs.html) and [contributor](https://docs.argilla.io/en/latest/community/contributing.html) docs. Additionally, you can always [schedule a meeting](https://calendly.com/argilla-office-hours/30min) with our Developer Advocacy team so they can get you up to speed.
+We love contributors and have launched
+a [collaboration with JustDiggit](https://argilla.io/blog/introducing-argilla-community-growers) to hand out our very
+own bunds and help the re-greening of sub-Saharan Africa.
To help our community with the creation of contributions, we
+have created our [developer](https://docs.argilla.io/en/latest/community/developer_docs.html)
+and [contributor](https://docs.argilla.io/en/latest/community/contributing.html) docs. Additionally, you can
+always [schedule a meeting](https://calendly.com/argilla-office-hours/30min) with our Developer Advocacy team so they
+can get you up to speed.

 ## 🥇 Contributors

@@ -111,4 +135,5 @@ We love contributors and have launched a [collaboration with JustDiggit](https:/

 ## 🗺️ Roadmap

-We continuously work on updating [our plans and our roadmap](https://github.com/orgs/argilla-io/projects/10/views/1) and we love to discuss those with our community. Feel encouraged to participate.
+We continuously work on updating [our plans and our roadmap](https://github.com/orgs/argilla-io/projects/10/views/1) and
+we love to discuss those with our community. Feel encouraged to participate.
diff --git a/argilla-server/README.md b/argilla-server/README.md
index 57d145f4b7..8701f1146d 100644
--- a/argilla-server/README.md
+++ b/argilla-server/README.md
@@ -32,47 +32,21 @@

-Argilla is a **collaboration platform for AI engineers and domain experts** that require **high-quality outputs, full data ownership, and overall efficiency**.
+Argilla is a **collaboration platform for AI engineers and domain experts** that require **high-quality outputs, full
+data ownership, and overall efficiency**.

-This repository only contains developer info about the backend server. If you want to get started, we recommend taking a look at our [main repository](https://github.com/argilla-io/argilla) or our [documentation](https://docs.argilla.io/).
+This repository only contains developer info about the backend server. If you want to get started, we recommend taking a
+look at our [main repository](https://github.com/argilla-io/argilla) or our [documentation](https://docs.argilla.io/).

-Are you a contributor or do you want to understand what is going on under the hood, please keep reading the documentation below.
-
-## Clone repository
-
-`argilla-server` is using `argilla` repository as submodule to build frontend statics so when cloning use the following command:
-
-```sh
-git clone --recurse-submodules git@github.com:argilla-io/argilla-server.git
-```
-
-If you already cloned the repository without using `--recurse-submodules` you can init and update the submodules with:
-
-```sh
-git submodule update --remote --recursive --init
-```
-
-> [!IMPORTANT]
-> By default `argilla` submodule is using `develop` branch so the previous command will get the latest commit from that branch.
-
-### Specify a tag for argilla submodule
-
-When doing a release we should change `argilla` submodule to use an specific tag. In the following example we are setting tag `v1.22.0`:
-
-```sh
-cd argilla
-git fetch --tags
-git checkout v1.22.0
-```
-
-> [!NOTE]
-> You should see some changes on the `argilla-server` root folder where the subproject commit is now changed to the one from the tag version. Feel free to commit these changes.
+Are you a contributor or do you want to understand what is going on under the hood, please keep reading the
+documentation below.

 ## Development environment

-By default all commands executed with `pdm run` will get environment variables from `.env.dev` except command `pdm test` that will overwrite some of them using values coming from `.env.test` file.
+By default all commands executed with `pdm run` will get environment variables from `.env.dev` except command `pdm test`
+that will overwrite some of them using values coming from `.env.test` file.

-These environment variables can be overrided if necessary so feel free to defined your own ones locally.
+These environment variables can be override if necessary so feel free to defined your own ones locally.

 ### Run cli

@@ -82,7 +56,8 @@ pdm cli

 ### Run database migrations

-By default a SQLite located at `~/.argilla/argilla.db` will be used. You can create the database and run migrations with the following custom PDM command:
+By default a SQLite located at `~/.argilla/argilla.db` will be used. You can create the database and run migrations with
+the following custom PDM command:

 ```sh
 pdm migrate
@@ -90,7 +65,8 @@ pdm migrate

 ### Run tests

-A SQLite database located at `~/.argilla/argilla-test.db` will be automatically created to run tests. You can run the entire test suite using the following custom PDM command:
+A SQLite database located at `~/.argilla/argilla-test.db` will be automatically created to run tests. You can run the
+entire test suite using the following custom PDM command:

 ```sh
 pdm test
@@ -98,21 +74,8 @@ pdm test

 ## Run development server

-### Build frontend static files
-
-Before running Argilla development server we need to build the frontend static files.
Node version 18 is required for this action:
-
-```sh
-brew install node@18
-```
-
-After that you can build the frontend static files:
-
-```sh
-./scripts/build_frontend.sh
-```
-
-After running the previous script you should have a folder at `src/argilla_server/static` with all the frontend static files successfully generated.
+Note: If you need to run the frontend server you can follow the instructions at
+the [argilla-frontend](/argilla-frontend/README.md) project

 ### Run uvicorn development server

From d6e140ab5a77a1ddf61187e62ee5448a23727b82 Mon Sep 17 00:00:00 2001
From: Francisco Aranda
Date: Tue, 14 May 2024 16:19:47 +0200
Subject: [PATCH 09/11] Update argilla-frontend/README.md

---
 argilla-frontend/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/argilla-frontend/README.md b/argilla-frontend/README.md
index f95b28006a..a1f116da30 100644
--- a/argilla-frontend/README.md
+++ b/argilla-frontend/README.md
@@ -67,7 +67,7 @@ There are different options to get started:

 ## 🖥️ FRONTEND

--Before running Argilla frontend server, we need to install Node version 18:
+-Before running Argilla frontend server, you need to install Node version 18:

 ```bash
 -brew install node@18

From 55627fcd9ef493de1714d8c25dd89eb2bea56277 Mon Sep 17 00:00:00 2001
From: Francisco Aranda
Date: Tue, 14 May 2024 16:20:04 +0200
Subject: [PATCH 10/11] Update argilla-frontend/README.md

---
 argilla-frontend/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/argilla-frontend/README.md b/argilla-frontend/README.md
index a1f116da30..3eb818f43c 100644
--- a/argilla-frontend/README.md
+++ b/argilla-frontend/README.md
@@ -67,7 +67,7 @@ There are different options to get started:

 ## 🖥️ FRONTEND

--Before running Argilla frontend server, you need to install Node version 18:
+- Before running Argilla frontend server, you need to install Node version 18:

 ```bash
 -brew install node@18

From a1b36d4b6fb350b44f0ceb9fc3af09fc0dbc1deb Mon Sep 17 00:00:00 2001
From: Francisco Aranda
Date: Tue, 14 May 2024 17:52:57 +0200
Subject: [PATCH 11/11] Update argilla-frontend/README.md

---
 argilla-frontend/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/argilla-frontend/README.md b/argilla-frontend/README.md
index 3eb818f43c..d8b4608aea 100644
--- a/argilla-frontend/README.md
+++ b/argilla-frontend/README.md
@@ -70,7 +70,7 @@ There are different options to get started:
 - Before running Argilla frontend server, you need to install Node version 18:

 ```bash
--brew install node@18
+brew install node@18
 ```

💣 Install dependencies