Microservice of ChatEval to handle evaluation of neural chatbot models. Uses both word embeddings and Amazon Mechanical Turk to evaluate models.
The Evaluation microservice can be initialized by running `source init.sh`, which uses `wget` to download the pre-trained word embeddings (configurable with an environment variable named `EMBEDDING_FILE`) and runs the Flask server on port 8001.
To run the automatic evaluation, make a `POST` request to `/auto` with the parameters `model_responses` and `baseline_responses`, given as string lists of equal length. The response is a JSON object whose keys are the evaluation metrics and whose values are the corresponding floats.
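A request to `/auto` might be sketched in Python as follows. This is a minimal illustration, not the project's own client: it assumes the parameters are sent as a JSON body and that the server is reachable at `http://localhost:8001`. The helper names `build_auto_request` and `run_auto_evaluation` are hypothetical.

```python
import json
import urllib.request


def build_auto_request(model_responses, baseline_responses,
                       base_url="http://localhost:8001"):
    """Build a POST request for /auto.

    Assumption: the service accepts a JSON body; adjust the encoding
    if it expects form data instead.
    """
    # The two lists must have equal length.
    if len(model_responses) != len(baseline_responses):
        raise ValueError(
            "model_responses and baseline_responses must have equal length"
        )
    payload = json.dumps({
        "model_responses": list(model_responses),
        "baseline_responses": list(baseline_responses),
    }).encode("utf-8")
    return urllib.request.Request(
        base_url + "/auto",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_auto_evaluation(model_responses, baseline_responses,
                        base_url="http://localhost:8001"):
    """Send the request and return the parsed JSON object of metric values."""
    request = build_auto_request(model_responses, baseline_responses, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires the service to be running):
# scores = run_auto_evaluation(["hi there", "i am fine"],
#                              ["hello", "doing well"])
# for metric, value in scores.items():
#     print(metric, value)
```

Checking the list lengths on the client side avoids a round trip for a request the service is documented to reject.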
ChatEval supports the use of Docker as both a development and deployment tool.
- Install Docker.
- Configure environment variables in `Dockerfile` by adding `ENV variable value` for each environment variable.
- Build Docker image by using `docker build -t evaluation .` (this may take some time).
- Run Evaluation on port 8001 by using `docker run evaluation`
- Access app at localhost:8001.