A tool to convert Rasa data to other formats more friendly to tools like spacy, sklearn, transformers

nsorros/rasa-converter

Rasa converter 💬

Rasa converter converts the Rasa training data format into a more convenient format for training outside Rasa. By default the output is a JSONL file in which each line follows the structure below

{
    "text": ...,
    "intent": ...,
    "entities": [
        {
            "start_char": ...,
            "end_char": ...,
            "label": ...,
            "text": ...,
        },
        ...
    ]
}
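As an illustration of how such records might be consumed, here is a minimal sketch that parses JSONL lines and checks that each entity's offsets reproduce its text. The helper names `load_records` and `check_entities` are hypothetical (they are not part of rasa-converter), the sample record is made up, and the sketch assumes `end_char` is exclusive, as in spaCy.

```python
import json


def load_records(lines):
    """Parse JSONL lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def check_entities(record):
    """Return the entities whose offsets do not reproduce their text.

    Assumes end_char is exclusive, so text[start_char:end_char] is the span.
    """
    text = record["text"]
    return [
        ent
        for ent in record.get("entities", [])
        if text[ent["start_char"]:ent["end_char"]] != ent["text"]
    ]


# Illustrative record, not taken from the repository.
sample = (
    '{"text": "book a flight to Paris", "intent": "book_flight", '
    '"entities": [{"start_char": 17, "end_char": 22, '
    '"label": "city", "text": "Paris"}]}'
)

records = load_records([sample])
bad = check_entities(records[0])  # empty list when all offsets line up
```

To read a converted file instead of the inline sample, pass an open file handle to `load_records`, since iterating over a file yields its lines.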

🛠 Install

python -m venv venv
source venv/bin/activate

git clone https://www.github.com/nsorros/rasa-converter
cd rasa-converter
pip install ".[spacy,sklearn]"

▶️ Quickstart

Train a spacy model using Rasa data

rasa-convert PATH_TO_RASA_DATA data.spacy --format spacy
split-data data.spacy train_data.spacy test_data.spacy

spacy init config config.cfg --pipeline textcat,ner # ner if data have entities
spacy train config.cfg --paths.train train_data.spacy --paths.dev test_data.spacy
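The idea behind the split step can be pictured as a shuffled hold-out split. The sketch below is not the implementation of the split-data command; `split_records`, its `test_size` and `seed` parameters, and the toy records are all assumptions made for illustration.

```python
import random


def split_records(records, test_size=0.2, seed=42):
    """Shuffle a copy of the records and hold out a test portion.

    Returns (train, test). At least one record is held out when the
    input is non-empty.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


# Toy records standing in for converted Rasa examples.
records = [{"text": f"example {i}", "intent": "demo"} for i in range(10)]
train, test = split_records(records)
```

Fixing the seed keeps the split reproducible between runs, which makes evaluation scores comparable across experiments.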

Train an sklearn model using Rasa data

rasa-convert PATH_TO_RASA_DATA data.jsonl

sklearn data.jsonl

Note that we provide a convenient sklearn CLI entry point that trains a tfidf-svm model. To experiment with different models, you can write your own training loop on top of the converted JSONL data.
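A custom training loop for intent classification might look like the sketch below. It is not the code behind the bundled sklearn entry point: the choice of `LinearSVC`, the helper name `train_intent_classifier`, and the inline records are assumptions, and in practice the records would be loaded from data.jsonl with one `json.loads` call per line.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def train_intent_classifier(records):
    """Fit a TF-IDF + linear SVM pipeline on text/intent pairs."""
    texts = [r["text"] for r in records]
    intents = [r["intent"] for r in records]
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(texts, intents)
    return model


# Made-up records in the converted JSONL structure (entities omitted).
records = [
    {"text": "hello there", "intent": "greet"},
    {"text": "hi, good morning", "intent": "greet"},
    {"text": "bye for now", "intent": "goodbye"},
    {"text": "see you later, goodbye", "intent": "goodbye"},
]

model = train_intent_classifier(records)
prediction = model.predict(["hello"])[0]  # one of the intent labels
```

Swapping `LinearSVC` for another scikit-learn estimator, such as `LogisticRegression`, only requires changing the second step of the pipeline.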

⚙️ Contribute

You need to set up a development environment with all dependencies. You will probably also want to install the package in editable mode.

python -m venv venv
source venv/bin/activate

git clone https://www.github.com/nsorros/rasa-converter
cd rasa-converter
pip install -e ".[spacy,sklearn,test]"

To run the tests, run pytest or, even better, tox.
