vira-dialog-act-classification

Scope

The purpose of this project is to train and serve a language model for dialog-act classification, used by the VIRA chatbot.

Usage

This repo should be used in the following scenarios, in this order:

Adding new dialog-acts

Adding new dialog-acts is normally done on a maintainer's personal computer using a CSV editor.

If the repo does not exist on your computer, clone it using the command:

git clone https://github.com/IBM/vira-dialog-act-classification.git

To add new dialog-acts:

Make sure you are using the latest version of the repo by running: git pull
Edit the CSV files under dialog_act_dataset using a CSV editor.
Commit your changes to the repo. It is recommended to create a Pull Request when making changes, as described under Maintenance.
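
Before committing, a quick consistency check can catch rows that were damaged during editing. The following is a minimal sketch, not part of the repo: it assumes only that each CSV file starts with a header row, and the function name find_csv_problems is hypothetical. Treating empty cells as suspicious is also an assumption; drop that check if the dataset allows them.

```python
import csv
import io


def find_csv_problems(text):
    """Return a list of (line_number, message) for rows that look malformed.

    Assumption: the first row is a header. Nothing else about the
    dataset's columns is assumed.
    """
    problems = []
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return [(1, "file is empty")]
    for row in reader:
        line_no = reader.line_num  # line of the source text just consumed
        if not row:
            problems.append((line_no, "blank row"))
        elif len(row) != len(header):
            problems.append(
                (line_no, f"expected {len(header)} columns, found {len(row)}")
            )
        elif any(cell.strip() == "" for cell in row):
            # Assumption: an empty cell is a mistake, not a valid value.
            problems.append((line_no, "empty cell"))
    return problems
```

To use it, read each edited file as text (for example with pathlib.Path.read_text) and pass the content to find_csv_problems; an empty list means no problems were found.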

Packaging the repository for deployment

This step is required only when there are changes to the Python code. It can be executed from the computer used for adding new dialog-acts.

Pre-requisites:

If the repo does not exist on your computer, clone it as shown in the previous section.
Make sure that Docker Desktop is installed on your computer.
Create a repository on Docker Hub, as explained in the Docker Hub documentation.

To package the repo for deployment:

Run: docker build . -t <hub-user>/<repo-name>:vira-dialog-act-classifier
Run: docker push <hub-user>/<repo-name>:vira-dialog-act-classifier
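
The two packaging commands can also be scripted so that the same image reference is used for both. This is a rough sketch under stated assumptions: the helper names image_ref, packaging_commands and package are hypothetical and not part of the repo, and the image is tagged with the full <hub-user>/<repo-name> reference at build time so that docker push can find it.

```python
import subprocess


def image_ref(hub_user, repo_name, tag="vira-dialog-act-classifier"):
    """Build the image reference in the <hub-user>/<repo-name>:<tag> form."""
    return f"{hub_user}/{repo_name}:{tag}"


def packaging_commands(hub_user, repo_name):
    """Return the docker build and push commands as argument lists."""
    ref = image_ref(hub_user, repo_name)
    return [
        ["docker", "build", ".", "-t", ref],
        ["docker", "push", ref],
    ]


def package(hub_user, repo_name):
    """Run build, then push; check=True stops at the first failing command."""
    for cmd in packaging_commands(hub_user, repo_name):
        subprocess.run(cmd, check=True)
```

Run package from the repo root directory, since the build context is the current directory. It requires a running Docker daemon and a prior docker login.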

Training a new dialog-act classification model

In many cases, training a new model can be done on the same computer that was used for adding new dialog-acts. However, it is also possible to use a separate computer, preferably one with a GPU, for faster execution.

If this repo does not exist on the computer used for training, clone it using the command:

git clone https://github.com/IBM/vira-dialog-act-classification.git

And in addition:

Make sure you have Python 3.7+ installed.
Open a shell and change directory to the repo root directory
Create a new Python virtual environment: python -m venv venv
Activate the virtual environment using: source venv/bin/activate
Install the dependencies using: pip install -r requirements.txt
Deactivate the virtual environment by running: deactivate
Register at 🤗 Hugging Face and obtain your authentication token from the tokens page
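
To confirm that the setup above took effect, the interpreter version and the active environment can be checked from Python itself. This is a small sketch, with hypothetical function names, using only the standard library:

```python
import sys


def python_is_supported(version_info=sys.version_info, minimum=(3, 7)):
    """Return True if the interpreter meets the Python 3.7+ requirement."""
    return tuple(version_info[:2]) >= minimum


def in_virtualenv():
    """Return True when running inside a venv.

    Inside a venv, sys.prefix points at the environment while
    sys.base_prefix points at the interpreter it was created from.
    """
    return sys.prefix != sys.base_prefix
```

Calling both functions after activating venv should return True; in_virtualenv() returns False once the environment is deactivated.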

To train a new model:

Make sure you are using the latest version of the repo using the command git pull
Activate the virtual environment using: source venv/bin/activate
Run the trainer script and wait until it finishes: python trainer.py
Upload the new model and the dataset to the Hugging Face Hub using the command: python upload.py <your_auth_token>
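
The training and upload steps can be chained so that the upload only runs if training succeeded. The sketch below is an illustration, not part of the repo: the function names are hypothetical, and reading the token from an environment variable named HF_TOKEN is this sketch's own choice, made to keep the token out of the shell history.

```python
import os
import subprocess
import sys


def training_commands(token, python=sys.executable):
    """Return the trainer and upload commands as argument lists."""
    return [
        [python, "trainer.py"],
        [python, "upload.py", token],
    ]


def train_and_upload(env_var="HF_TOKEN"):
    """Run trainer.py, then upload.py, stopping if either one fails."""
    # Assumption: the token is stored in the environment variable env_var.
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"set {env_var} to your Hugging Face token first")
    for cmd in training_commands(token):
        # check=True raises CalledProcessError, so a failed training run
        # never reaches the upload step.
        subprocess.run(cmd, check=True)
```

Run train_and_upload from the repo root directory with the virtual environment activated, so that sys.executable is the interpreter that has the dependencies installed.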

Deploying a dialog-act classification model

Deployment is normally done on a remote server that is publicly reachable on the web and runs a platform for containerized services, such as Kubernetes. For testing purposes, it is also possible to deploy on a personal computer. Hardware with a GPU is recommended but not mandatory.

To deploy the model on a remote server:

Configure your container platform to run the Docker image <hub-user>/<repo-name>:vira-dialog-act-classifier
Verify that the service is running by opening a browser at the URL https://<server-ip>/health

To deploy the model on a personal computer:

Make sure you have Docker Desktop installed
Run: docker run -p 8000:8000 <hub-user>/<repo-name>:vira-dialog-act-classifier
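Verify that the service is running by opening a browser at the URL http://127.0.0.1:8000/health (the container started with the command above is reached over plain HTTP unless TLS has been configured separately)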
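
The browser check can also be automated, for example in a deployment script. This is a minimal sketch using only the standard library; the function name is_healthy is hypothetical, and it assumes nothing about the response body, only that a healthy service answers the /health path with HTTP status 200.

```python
import urllib.error
import urllib.request


def is_healthy(base_url, timeout=5.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, TLS errors and
        # 4xx/5xx responses (HTTPError is a subclass of URLError).
        return False
```

Pass the scheme, host and port of the deployment as base_url, without the /health suffix.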

Maintenance

Pull requests are very welcome! Make sure your patches are well tested. Ideally create a topic branch for every separate change you make. For example:

Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Added some feature')
Push to the branch (git push origin my-new-feature)
Create a new Pull Request and ask another person to review and merge it

License

All source files must include a Copyright and License header. The SPDX license header is preferred because it can be easily scanned.

For the full license text, see the LICENSE file in the repo root directory.

#
# Copyright 2020- IBM Inc. All rights reserved
# SPDX-License-Identifier: Apache-2.0
#
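
Because the SPDX header is meant to be easy to scan, a check for it is straightforward to write. The following is a sketch only: the function name spdx_identifier and the choice to look at the first 20 lines are assumptions of this example, and it handles only "#"-style comments as used in Python files.

```python
import re

# Matches a comment line such as "# SPDX-License-Identifier: <id>".
SPDX_LINE = re.compile(
    r"^\s*#\s*SPDX-License-Identifier:\s*(\S+)", re.MULTILINE
)


def spdx_identifier(source, max_lines=20):
    """Return the SPDX identifier found near the top of a file, or None."""
    head = "\n".join(source.splitlines()[:max_lines])
    match = SPDX_LINE.search(head)
    return match.group(1) if match else None
```

A source file that returns None is missing the header and should be fixed before the change is merged.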

More Information

More information can be found in these files:

LICENSE
CONTRIBUTING.md
MAINTAINERS.md
CHANGELOG.md
dco.yml - Enables the DCO bot for this repo; see https://github.com/probot/dco for more details.

Notes

If you have any questions or run into problems, open a new issue in this repo's issue tracker on GitHub.
