Code for our IEEE ICWS2021 short paper "Efficient Grammatical Error Correction with Hierarchical Error Detections and Correction".

AnticPan/Hierarchical-GEC

Hierarchical-GEC

Code for our IEEE ICWS2021 short paper "Efficient Grammatical Error Correction with Hierarchical Error Detections and Correction".

Installation

Clone this repository and enter its directory.

Create a Python 3.7 virtual environment and run the following command:

pip install -r requirements.txt

Datasets and Trained Model

All datasets used in the paper can be found here. Files in M2 format should be converted to TSV files with one source and target sentence pair per line, which can be done with utils/m2_to_tsv.py.
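To illustrate what the M2-to-TSV conversion involves, here is a minimal sketch. It is not the code in utils/m2_to_tsv.py; the function name m2_to_pairs, the choice of annotator 0, and the handling of noop edits are assumptions made for this example, based on the common M2 layout (an "S" line with the tokenized source, followed by "A start end|||type|||correction|||...|||annotator" lines and a blank line).

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def m2_to_pairs(lines: Iterable[str], annotator: int = 0) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) pairs from M2 lines (hypothetical helper)."""
    source: Optional[List[str]] = None
    edits: List[Tuple[int, int, str]] = []

    def build_target() -> str:
        tokens = list(source or [])
        # Apply edits from right to left so earlier token offsets stay valid.
        for start, end, correction in reversed(sorted(edits, key=lambda e: e[:2])):
            tokens[start:end] = correction.split()
        return " ".join(tokens)

    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("S "):
            if source is not None:  # previous block had no trailing blank line
                yield " ".join(source), build_target()
            source, edits = line[2:].split(), []
        elif line.startswith("A ") and source is not None:
            fields = line[2:].split("|||")
            start, end = (int(x) for x in fields[0].split())
            # Skip "no error" markers and edits from other annotators.
            if fields[1] == "noop" or start < 0 or int(fields[-1]) != annotator:
                continue
            correction = "" if fields[2] == "-NONE-" else fields[2]
            edits.append((start, end, correction))
        elif not line.strip() and source is not None:
            yield " ".join(source), build_target()
            source, edits = None, []
    if source is not None:
        yield " ".join(source), build_target()


sample = [
    "S This are a sentence .",
    "A 1 2|||R:VERB:SVA|||is|||REQUIRED|||-NONE-|||0",
    "",
    "S It is fine .",
    "A -1 -1|||noop|||-NONE-|||REQUIRED|||-NONE-|||0",
    "",
]
pairs = list(m2_to_pairs(sample))
for src, tgt in pairs:
    print(src + "\t" + tgt)  # one tab-separated pair per line
```

When a sentence has several annotators, this sketch keeps only one of them; the repository's script may treat that case differently.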

Our trained model can be downloaded here.

Train Model

  1. Download BERT or SpanBERT from here.
  2. Prepare the training and development datasets.
  3. Run the following command:
python train.py --bert_dir [BERT_DIR] \
                --train_file [TRAIN_FILE/DIR] \
                --valid_file [DEV_FILE/DIR] \
                --output_dir [OUTPUT_DIR] \
                --gpus 0 \
                --truncate 50 \
                --epoch 3 \
                --batch_size 128 \
                --lr 3e-5
  4. The trained model is saved in [OUTPUT_DIR]/model/epoch-[3]

Predict

Choose Threshold

The default threshold is 0.5; you can find a better one by grid search on the development set.

  1. Set the model_dir and valid_file in grid.sh
  2. Run bash grid.sh
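The idea behind a threshold grid search can be sketched as follows. This is a generic illustration, not the contents of grid.sh: it assumes the threshold is applied to a detector's error probabilities, that gold binary labels are available for the development set, and that F0.5 is the selection metric. The names f_beta and grid_search_threshold are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta score from raw counts; returns 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def grid_search_threshold(
    probs: Sequence[float],
    labels: Sequence[int],
    candidates: Iterable[float],
) -> Tuple[float, float]:
    """Return the (threshold, score) pair with the highest F0.5 on the given data."""
    best_threshold, best_score = 0.5, -1.0
    for threshold in candidates:
        # An item is predicted erroneous when its probability reaches the threshold.
        tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y)
        fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and not y)
        fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y)
        score = f_beta(tp, fp, fn)
        if score > best_score:  # ties keep the lowest threshold tried first
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Toy development data (made up for illustration only).
dev_probs = [0.9, 0.6, 0.4, 0.2]
dev_labels = [1, 0, 1, 0]
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9
best_threshold, best_score = grid_search_threshold(dev_probs, dev_labels, candidates)
print(best_threshold, round(best_score, 4))
```

The selected value would then be passed to predict.py through --discriminating_threshold.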

Predict File

python predict.py --model_dir [TRAINED_MODEL_DIR] \
                  --output_dir [OUTPUT_DIR] \
                  --test_file [TEST_FILE] \
                  --discriminating_threshold [0.5] \
                  --batch_size 16 \
                  --gpu 0

Use gRPC

  1. Start the gRPC server with the command:
python grpc_server.py --model_dir [TRAINED_MODEL_DIR]
  2. Call the API as shown in grpc_client.py
