We use poetry. Run `poetry install` at the root of the repo.
The model (`*.ckpt`) is tracked with Git LFS (Large File Storage); if it is missing after cloning, it can be fetched with `git lfs pull`.
The code for the evaluator can be found under `covidfaq/evaluating`. The main script is `evaluator.py`. It needs to be pointed to a JSON file containing the evaluation data, and to know which model should be evaluated.
Optionally, it accepts a config file to initialize the model. E.g.:

    poetry run python covidfaq/evaluating/evaluator.py \
        --test-data=covidfaq/evaluating/faq_eval_data.json \
        --model-type=embedding_based_reranker \
        --config=[...]/config.yaml
or:

    poetry run python covidfaq/evaluating/evaluator.py \
        --test-data=covidfaq/evaluating/faq_eval_data.json \
        --model-type=cheating_model
To evaluate Google's model, export your authentication key as an environment variable:

    export GCLOUD_AUTH_TOKEN=$(gcloud auth application-default print-access-token)
then run:

    poetry run python covidfaq/evaluating/evaluator.py \
        --test-data=covidfaq/evaluating/faq_eval_data.json \
        --model-type=google_model
To use the evaluator with a new model, two modifications are needed:

- Create a new class (under `covidfaq/evaluating/model`) that implements the interface `covidfaq/evaluating/model/model_evaluation_interface.py`. See the doc in this interface for more info on the two methods to implement. Note that any information you need to initialize your model must be passed in the config file given to the evaluator with `--config=..../config.yaml`; for example, the saved model weight location can be specified there. A sketch of such a class is shown below.
- Add an `if` in `evaluator.py` to accept your model (and to load the proper class that you implemented in the point above); see the second sketch below.
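For illustration, here is a minimal sketch of what such a class could look like. Everything in it is an assumption: the real interface class and the names of its two methods are defined in `covidfaq/evaluating/model/model_evaluation_interface.py`, and the file name, class names, method names, and config key used below are placeholders, not the repo's actual API.

    # covidfaq/evaluating/model/my_new_model.py -- hypothetical file name.
    # Sketch only: the real interface and its two method names live in
    # model_evaluation_interface.py; every name below is a placeholder.
    from abc import ABC, abstractmethod

    import yaml


    class ModelEvaluationInterface(ABC):
        """Stand-in for the repo's interface and its two methods to implement."""

        @abstractmethod
        def load(self):  # hypothetical name for the first method
            ...

        @abstractmethod
        def answer_question(self, question, candidate_answers):  # hypothetical name for the second method
            ...


    class MyNewModel(ModelEvaluationInterface):
        """Example wrapper; anything needed at init time comes from the --config file."""

        def __init__(self, config_path):
            with open(config_path) as f:
                cfg = yaml.safe_load(f)
            # e.g. where the saved weights (*.ckpt) live -- "model_weights" is a made-up key
            self.ckpt_path = cfg.get("model_weights")

        def load(self):
            # load the checkpoint found at self.ckpt_path
            pass

        def answer_question(self, question, candidate_answers):
            # return the candidate judged most relevant to the question
            return candidate_answers[0]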
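The corresponding change in `evaluator.py` could then look roughly like this. Again this is only a sketch: the script's actual argument handling is not reproduced, and `build_model` / `my_new_model` are made-up names.

    # Hypothetical view of the --model-type dispatch in evaluator.py: add a branch
    # for the new model type and construct the class implemented above.
    def build_model(model_type, config_path):
        if model_type == "my_new_model":  # the new branch to add
            from covidfaq.evaluating.model.my_new_model import MyNewModel
            return MyNewModel(config_path)
        # ... existing branches (embedding_based_reranker, cheating_model, google_model)
        raise ValueError(f"unknown --model-type: {model_type}")

Once the branch exists, the new model can be evaluated by passing the chosen name to `--model-type`.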