Merge branch 'develop' into docs/final-docs-1.x-revision
# Conflicts:
#	README.md
davidberenstein1957 committed Nov 3, 2022
2 parents 4835ab8 + f9988a7 commit 3407082
Showing 6 changed files with 39 additions and 31 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -20,7 +20,7 @@ repos:
# - --remove-header

- repo: https://github.com/psf/black
rev: 22.8.0
rev: 22.10.0
hooks:
- id: black
additional_dependencies: ['click==8.0.4']
42 changes: 22 additions & 20 deletions README.md
@@ -6,33 +6,36 @@
<br>
</h1>
<p align="center">
<a href="https://pypi.org/project/rubrix/">
<a href="https://pypi.org/project/argilla/">
<img alt="CI" src="https://img.shields.io/pypi/v/rubrix.svg?style=flat-square&logo=pypi&logoColor=white">
</a>
<!--a href="https://anaconda.org/conda-forge/rubrix">
<img alt="CI" src="https://img.shields.io/conda/vn/conda-forge/rubrix?logo=anaconda&style=flat&color=orange">
</!a-->
<img alt="Codecov" src="https://img.shields.io/codecov/c/github/recognai/rubrix">
<a href="https://pepy.tech/project/rubrix">
<img alt="CI" src="https://static.pepy.tech/personalized-badge/rubrix?period=month&units=international_system&left_color=grey&right_color=blue&left_text=pypi%20downloads/month">
<a href="https://pepy.tech/project/argilla">
<img alt="CI" src="https://static.pepy.tech/personalized-badge/argilla?period=month&units=international_system&left_color=grey&right_color=blue&left_text=pypi%20downloads/month">
</a>
</p>

<h2 align="center">Open-source framework for data-centric NLP</h2>
<p align="center">Data Labeling + Data Curation + Inference Store</p>
<p align="center">Designed for MLOps & Feedback Loops</p>

<iframe width="100%" height="450" src="https://www.youtube.com/embed/jP3anvp7Rto" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

https://user-images.githubusercontent.com/1107111/197567844-4370487d-fe44-441e-9a92-48e529713a15.mp4

<br>

<p align="center">
<a href="https://join.slack.com/t/rubrixworkspace/shared_invite/zt-whigkyjn-a3IUJLD7gDbTZ0rKlvcJ5g">
<img src="https://img.shields.io/badge/JOIN US ON SLACK-4A154B?style=for-the-badge&logo=slack&logoColor=white" />
</a>
<a href="https://linkedin.com/company/argilla-io">
<img src="https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white" />
</a>
<a href="https://twitter.com/argilla_io">
<img src="https://img.shields.io/badge/Twitter-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white" />
</a>
</p>

<br>
@@ -52,25 +55,25 @@

### Advanced NLP labeling

- Programmatic labeling using Weak Supervision. Built-in label models (Snorkel, Flyingsquid)
- Bulk-labeling and search-driven annotation
- Iterate on training data with any pre-trained model or library
- Programmatic labeling using [weak supervision](https://docs.argilla.io/en/latest/guides/techniques/weak_supervision.html). Built-in label models (Snorkel, Flyingsquid)
- [Bulk-labeling](https://docs.argilla.io/en/latest/reference/webapp/features.html#bulk-annotate) and [search-driven annotation](https://docs.argilla.io/en/latest/guides/features/queries.html)
- Iterate on training data with any [pre-trained model](https://docs.argilla.io/en/latest/tutorials/libraries/huggingface.html) or [library](https://docs.argilla.io/en/latest/tutorials/libraries/libraries.html)
- Efficiently review and refine annotations in the UI and with Python
- Use Argilla built-in metrics and methods for finding label and data errors (e.g., cleanlab)
- Simple integration with active learning workflows
- Use Argilla built-in metrics and methods for [finding label and data errors (e.g., cleanlab)](https://docs.argilla.io/en/latest/tutorials/notebooks/monitoring-textclassification-cleanlab-explainability.html)
- Simple integration with [active learning workflows](https://docs.argilla.io/en/latest/tutorials/techniques/active_learning.html)

### Monitoring

- Close the gap between production data and data collection activities
- Auto-monitoring for major NLP libraries and pipelines (spaCy, Hugging Face, FlairNLP)
- ASGI middleware for HTTP endpoints
- Argilla Metrics to understand data and model issues, like entity consistency for NER models
- [Auto-monitoring](https://docs.argilla.io/en/latest/guides/steps/3_deploying.html) for [major NLP libraries and pipelines](https://docs.argilla.io/en/latest/tutorials/libraries/libraries.html) (spaCy, Hugging Face, FlairNLP)
- [ASGI middleware](https://docs.argilla.io/en/latest/tutorials/notebooks/deploying-texttokenclassification-fastapi.html) for HTTP endpoints
- Argilla Metrics to understand data and model issues, [like entity consistency for NER models](https://docs.argilla.io/en/latest/guides/steps/4_monitoring.html)
- Integrated with Kibana for custom dashboards

### Team workspaces

- Bring different users and roles into the NLP data and model lifecycles
- Organize data collection, review and monitoring into different workspaces
- Organize data collection, review and monitoring into different [workspaces](https://docs.argilla.io/en/latest/getting_started/installation/user_management.html#workspace)
- Manage workspace access for different users

## Quickstart
@@ -92,7 +95,7 @@ The simplest way is to use`Docker` by running:
docker run -d --name es-for-argilla -p 9200:9200 -p 9300:9300 -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2

```
> :information_source: **Check [the docs](https://rubrix.readthedocs.io/en/stable/getting_started/setup%26installation.html) for further options and configurations for Elasticsearch.**
> :information_source: **Check [the docs](https://docs.argilla.io/en/latest/getting_started/quickstart.html) for further options and configurations for Elasticsearch.**
Finally you can **launch the server**:

@@ -181,7 +184,6 @@ Argilla is useful if you want to:
### What do I need to start using Argilla?
You need to have a running instance of Elasticsearch and install the Argilla Python library.
The library is used to read and write data into Argilla.
To get started we highly recommend using Jupyter Notebooks so you might want to install Jupyter Lab or use Jupiter support for VS Code for example.

### How can I "upload" data into Argilla?
Currently, the only way to upload data into Argilla is by using the Python library.
@@ -213,10 +215,10 @@ The training datasets created with Argilla are model agnostic.

You can choose one of many amazing frameworks to train your model, like [transformers](https://huggingface.co/docs/transformers/), [spaCy](https://spacy.io/), [flair](https://github.com/flairNLP/flair) or [sklearn](https://scikit-learn.org).

Check out our [cookbook](https://rubrix.readthedocs.io/en/stable/guides/cookbook.html) and our [tutorials](https://rubrix.readthedocs.io/en/stable) on how Argilla integrates with these frameworks.
Check out our [deep dives](https://docs.argilla.io/en/latest/guides/guides.html) and our [tutorials](https://docs.argilla.io/en/latest/tutorials/tutorials.html) on how Argilla integrates with these frameworks.


If you want to train a Hugging Face transformer or spaCy NER model, we provide a neat shortcut to [prepare your dataset for training](https://rubrix.readthedocs.io/en/stable/reference/python/python_client.html#rubrix.client.datasets.DatasetForTextClassification.prepare_for_training).
If you want to train a Hugging Face transformer or spaCy NER model, we provide a neat shortcut to [prepare your dataset for training](https://docs.argilla.io/en/latest/guides/features/datasets.html#Prepare-dataset-for-training).
### Can Argilla share the Elasticsearch Instance/cluster?
Yes, you can use the same Elasticsearch instance/cluster for Argilla and other applications.
You only need to perform some configuration, check the Advanced installation guide in the docs.
@@ -233,8 +235,8 @@ curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: applicat

```
## Contributors
<a href="https://github.com/recognai/rubrix/graphs/contributors">
<a href="https://github.com/argilla-io/argilla/graphs/contributors">

<img src="https://contrib.rocks/image?repo=recognai/rubrix" />
<img src="https://contrib.rocks/image?repo=argilla-io/argilla" />

</a>
4 changes: 2 additions & 2 deletions docker-compose.yaml
@@ -2,7 +2,7 @@ version: "3"

services:
argilla:
image: argilla-io/argilla-server:latest
image: argilla/argilla-server:latest
restart: unless-stopped
ports:
- "6900:80"
@@ -48,4 +48,4 @@ networks:
driver: bridge

volumes:
elasticdata:
elasticdata:
16 changes: 11 additions & 5 deletions frontend/components/commons/datasets-list/DatasetsEmpty.vue
@@ -29,12 +29,18 @@ export default {
methods: {
generateCodeSnippet() {
return `import argilla as rg
return `# install datasets library with pip install datasets
import argilla as rg
from datasets import load_dataset
rg.log(
y rg.TextClassificationRecord(text="my first cool example"),
name='example-dataset'
)`;
# load dataset from the hub
dataset = load_dataset("argilla/gutenberg_spacy-ner", split="train")
# read in dataset, assuming its a dataset for token classification
dataset_rg = rg.read_datasets(dataset, task="TokenClassification")
# log the dataset
rg.log(dataset_rg, "gutenberg_spacy-ner")`;
},
},
};
4 changes: 2 additions & 2 deletions frontend/package.json
@@ -1,6 +1,6 @@
{
"name": "argilla",
"version": "1.0.0",
"version": "1.1.0-dev0",
"private": true,
"eslintIgnore": [
"node_modules/**/*",
@@ -40,7 +40,7 @@
"vue-moment": "^4.1.0",
"vue-svgicon": "^3.2.9",
"vue-vega": "^1.0.0-alpha.13",
"vue-virtual-scroller": "^1.0.10"
"vue-virtual-scroller": "~1.0.10"
},
"devDependencies": {
"@babel/core": "7.17.2",
2 changes: 1 addition & 1 deletion src/argilla/_version.py
@@ -13,4 +13,4 @@
# limitations under the License.

# coding: utf-8
version = "1.0.0"
version = "1.1.0-dev0"
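
Note on the `frontend/package.json` hunk: the `vue-virtual-scroller` constraint changes from `^1.0.10` to `~1.0.10`, which narrows the set of versions npm may install. The sketch below illustrates the difference under simplifying assumptions: it handles only full `MAJOR.MINOR.PATCH` versions with `MAJOR >= 1` and no pre-release tags. The helpers `parse` and `satisfies` are hypothetical names written for this illustration, not part of npm or of this repository.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a caret (^) or tilde (~) range.

    Simplified model: assumes a three-part base version with MAJOR >= 1.
    """
    operator, base = spec[0], parse(spec[1:])
    candidate = parse(version)
    if operator == "^":
        # Caret: allow changes that keep the major version fixed.
        upper = (base[0] + 1, 0, 0)
    elif operator == "~":
        # Tilde: allow patch-level changes only.
        upper = (base[0], base[1] + 1, 0)
    else:
        raise ValueError(f"unsupported range operator: {operator!r}")
    return base <= candidate < upper


if __name__ == "__main__":
    for candidate in ("1.0.10", "1.0.11", "1.1.0", "2.0.0"):
        print(candidate, satisfies(candidate, "^1.0.10"), satisfies(candidate, "~1.0.10"))
```

Under this model, a hypothetical `1.1.0` release would be accepted by the old caret range but rejected by the new tilde range, so the change restricts the dependency to patch releases of `1.0.x`.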
