vira-intent-classification

Scope

The purpose of this project is to train and serve an intent classification language model for the VIRA chatbot.

Usage

This repo should be used in the following scenarios, in this order:

Adding new intents

Adding new intents is normally done on a maintainer's personal computer using a CSV editor.

If the repo does not exist on your computer, clone it using the command:

git clone https://github.com/IBM/vira-intent-classification.git

To add new intents:

Make sure you have the latest version of the repo by running: git pull
Edit the CSV files under intent_dataset using a CSV editor.
Commit your changes to the repo. It is recommended to create a pull request when making changes, as described under Maintenance.
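
Before committing, it can help to sanity-check the edited files. The following is a minimal sketch and not part of the repo: it assumes each CSV has a header row, and it reports rows with a wrong column count, empty cells, or exact duplicates. The function name find_csv_problems and the sample column names are hypothetical; the actual layout of the files under intent_dataset may differ.

```python
import csv
import io


def find_csv_problems(csv_file):
    """Return a list of (line_number, message) for suspicious rows.

    Assumes the first row is a header. This is an assumption about the
    files under intent_dataset, not something the repo documents.
    """
    problems = []
    seen = {}  # normalized row -> line number of its first occurrence
    reader = csv.reader(csv_file)
    header = next(reader, None)
    if header is None:
        return [(1, "file is empty")]
    for row in reader:
        line_no = reader.line_num
        if len(row) != len(header):
            problems.append((line_no, "unexpected number of columns"))
        elif any(not cell.strip() for cell in row):
            problems.append((line_no, "empty cell"))
        key = tuple(cell.strip() for cell in row)
        if key in seen:
            problems.append((line_no, f"duplicate of line {seen[key]}"))
        else:
            seen[key] = line_no
    return problems


# Hypothetical sample data; the column names are illustrative only.
sample = io.StringIO(
    "text,intent\n"
    "Is the vaccine safe?,safety\n"
    "Is the vaccine safe?,safety\n"
    "Where can I get it?,\n"
)
for line_no, message in find_csv_problems(sample):
    print(f"line {line_no}: {message}")
```

To check a real file, open it with open(path, newline="", encoding="utf-8") and pass the file object to the function.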

Packaging the repository for deployment

This step is required only when there are changes to the Python code. It can be executed from the computer used for adding new intents.

Pre-requisites:

If the repo does not exist on your computer, clone it as shown in the previous section.
Make sure that Docker Desktop is installed on your computer.
Create a repository on Docker Hub, as explained in the Docker Hub documentation.

To package the repo for deployment:

Run: docker build . -t <hub-user>/<repo-name>:vira-intent-classifier
Run: docker push <hub-user>/<repo-name>:vira-intent-classifier

Training a new intent classification model

In many cases, training a new model can be done on the same computer that was used for adding new intents. However, it is also possible to use a separate computer, preferably one with a GPU, for faster execution.

If this repo does not exist on the computer used for training, clone it using the command:

git clone https://github.com/IBM/vira-intent-classification.git

In addition:

Make sure you have Python 3.7+ installed.
Open a shell and change directory to the repo root directory
Create a new Python virtual environment: python -m venv venv
Activate the virtual environment using: source venv/bin/activate
Install the dependencies using: pip install -r requirements.txt
Deactivate the virtual environment by running: deactivate
Register at 🤗 Hugging Face and obtain your authentication token from the tokens page

To train a new model:

Make sure you have the latest version of the repo by running: git pull
Activate the virtual environment using: source venv/bin/activate
Run the trainer script with python trainer.py and wait until it finishes.
Upload the new model and the dataset to the Hugging Face Hub using the command: python upload.py <your_auth_token>
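
The last two steps must run in order, and the upload only makes sense if training succeeded. A minimal sketch of a wrapper that enforces this is shown below. It is not part of the repo: the helper names build_commands and run_in_order are hypothetical, and the only things taken from the steps above are the script names and the token argument.

```python
import subprocess
import sys


def build_commands(auth_token):
    """Return the training and upload commands, in the order they must run."""
    return [
        [sys.executable, "trainer.py"],
        [sys.executable, "upload.py", auth_token],
    ]


def run_in_order(commands, runner=subprocess.run):
    """Run each command and stop at the first failure.

    check=True makes subprocess.run raise CalledProcessError on a non-zero
    exit code, so upload.py is never reached if trainer.py fails.
    """
    for command in commands:
        runner(command, check=True)
```

From the repo root, with the virtual environment activated, calling run_in_order(build_commands(token)) would run both steps. Reading the token from an environment variable instead of typing it on the command line keeps it out of the shell history; the variable name is up to you.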

Deploying an intent classification model

Deployment is normally done on a remote server that is publicly available on the web and supports a container platform such as Kubernetes. However, for testing purposes it is possible to deploy on a personal computer. It is recommended, but not mandatory, to use hardware with a GPU.

To deploy the model on a remote server:

Configure the platform used for containerized services to run the docker image <hub-user>/<repo-name>:vira-intent-classifier
Verify that the service is running by opening a browser at the URL https://<server-ip>/health

To deploy the model on a personal computer:

Make sure you have Docker Desktop installed
Run: docker run -p 8000:8000 <hub-user>/<repo-name>:vira-intent-classifier
Verify that the service is running by opening a browser at the URL https://127.0.0.1:8000/health
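
The health check can also be scripted instead of opened in a browser. The sketch below is an illustration using only the Python standard library; the function name is_healthy is hypothetical, and it assumes that a healthy service answers the /health path with HTTP status 200, which the steps above do not state explicitly.

```python
import urllib.error
import urllib.request


def is_healthy(base_url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if GET <base_url>/health answers with HTTP 200.

    opener is injectable so the check can be exercised without a network.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, TLS failure, or an HTTP error status.
        return False
```

Pass the same base URL used in the browser step, for example the server address for a remote deployment or the 127.0.0.1:8000 address for a local one.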

Maintenance

Pull requests are very welcome! Make sure your patches are well tested. Ideally create a topic branch for every separate change you make. For example:

Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Added some feature')
Push to the branch (git push origin my-new-feature)
Create a new pull request and ask another person to review and merge it

License

All source files must include a copyright and license header. The SPDX license header is preferred because it can be easily scanned.

The full license text is in the LICENSE file. An example header:

#
# Copyright 2020- IBM Inc. All rights reserved
# SPDX-License-Identifier: Apache-2.0
#
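
Because the SPDX tag is a fixed string, a check for missing headers is easy to script. The following is a minimal sketch under stated assumptions and not a tool shipped with the repo: the function name files_missing_spdx is hypothetical, it only looks at Python files by default, and it expects the tag within the first ten lines of each file.

```python
from pathlib import Path

SPDX_TAG = "SPDX-License-Identifier:"


def files_missing_spdx(root, pattern="*.py", max_lines=10):
    """Return paths under root whose first max_lines lack an SPDX tag."""
    missing = []
    for path in sorted(Path(root).rglob(pattern)):
        with path.open(encoding="utf-8", errors="replace") as source:
            # Read at most max_lines lines; headers sit at the top of a file.
            head = [line for _, line in zip(range(max_lines), source)]
        if not any(SPDX_TAG in line for line in head):
            missing.append(path)
    return missing
```

Note that the search is recursive, so a virtual environment directory inside the repo root would be scanned as well; point root at the directory that holds only the project sources, or filter the result.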

More Information

More information can be found in these files:

LICENSE
CONTRIBUTING.md
MAINTAINERS.md
CHANGELOG.md
dco.yml - Enables the DCO bot for this repo. See https://github.com/probot/dco for more details.

Notes

If you have any questions or run into problems, create a new issue in the repository's issue tracker on GitHub.
