Syn-QG is a Java-based question generator which generates questions from multiple sources:
- Dependency Parsing
- Semantic Role Labeling
- NER templates
- VerbNet Predicates
- PropBank Roleset Argument Descriptions
- Custom Rules
- Implication Rules
It then performs back-translation as a final step, using an en-de and a de-en model.
This repo is under construction but can be used to generate good QA pairs.
This repo contains the following code snippets:
- SRL, Dependency and NER Templates
- Lauri's Implications (Simple, Phrasal verbs and verb-noun collocations)
- Java interface for PPDB
- Java interface for Google N-gram
- BackTranslation Services (en-de and de-en)
- Language Modelling service with AllenNLP backend
- "Won't" to "Will not" Map
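As an illustration of what a "Won't" to "Will not" map is for, here is a minimal sketch of contraction expansion. The class name `ContractionExpander`, the method `expand`, and the four entries are assumptions made for this example; the map shipped in this repo may have different contents and a different interface.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Syn-QG: expands a few English contractions.
public class ContractionExpander {

    // Illustrative entries only; a real map would be much larger.
    private static final Map<String, String> EXPANSIONS = new LinkedHashMap<>();
    static {
        EXPANSIONS.put("won't", "will not");
        EXPANSIONS.put("can't", "cannot");
        EXPANSIONS.put("don't", "do not");
        EXPANSIONS.put("isn't", "is not");
    }

    // One case-insensitive alternation over all keys, bounded by word boundaries.
    private static final Pattern CONTRACTION = Pattern.compile(
            "\\b(" + String.join("|", EXPANSIONS.keySet()) + ")\\b",
            Pattern.CASE_INSENSITIVE);

    public static String expand(String sentence) {
        Matcher m = CONTRACTION.matcher(sentence);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String matched = m.group(1);
            String replacement = EXPANSIONS.get(matched.toLowerCase(Locale.ROOT));
            // Preserve sentence-initial capitalisation ("Won't" -> "Will not").
            if (Character.isUpperCase(matched.charAt(0))) {
                replacement = Character.toUpperCase(replacement.charAt(0))
                        + replacement.substring(1);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("John won't leave."));
    }
}
```

Normalising contractions before parsing keeps the auxiliary and the negation as separate tokens, which is presumably why such a map is useful ahead of template-based question generation.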
The repo does not include VerbNet and PropBank templates.
In order to get SynQG up and running, perform the following steps:
- Ensure that you have the Parsing Server installed and running. (The Parsing Server is a modified version of AllenNLP's Flask server, so it contains not only AllenNLP models but also other services such as WordNet hypernym extraction, a verb-to-noun converter, etc.)
Conda can be used to set up a virtual environment with the version of Python required for AllenNLP.
- Create a Conda environment with Python 3.6:
    conda create -n synqg python=3.6
- Activate the Conda environment. You will need to activate the Conda environment in each terminal in which you want to use AllenNLP.
    conda activate synqg
- Run pip install . from the root of the source folder.
To run the module, you need to run the following three steps:
- Start the Parsing server:
    python allennlp/service/server_simple.py
- Start the Back Translation server:
    python backtranslation/back_translation_server.py
- To debug how the questions are generated in a console-like view, run QuestionGenerationConsoleService.java, or run the Java snippet below:
    import net.synqg.qg.service.GeneratedQuestion;
    import net.synqg.qg.service.SynQGService;

    import java.util.Collections;
    import java.util.List;

    public class SynQgClient {

        public static void main(String[] args) {
            SynQGService synQGService = new SynQGService();
            String input = "John failed to kill Mary.";
            List<GeneratedQuestion> questions =
                    synQGService.generateQuestionAnswers(Collections.singletonList(input));
            for (GeneratedQuestion generatedQuestion : questions) {
                String outline = "";
                outline = outline + generatedQuestion.question() + "\t";
                outline = outline + generatedQuestion.shortAnswer() + "\t";
                outline = outline + generatedQuestion.templateName() + "\t";
                outline = outline + input;
                System.out.println(outline);
            }
        }
    }
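The client prints one tab-separated line per question (question, short answer, template name, input sentence). If that output is going to be loaded as TSV, a tab or line break inside a generated question would shift the columns. The following is a minimal sketch of a row builder that guards against this; `TsvRow` is a hypothetical helper written for this example and is not part of Syn-QG.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Syn-QG: builds one TSV row from string fields.
public class TsvRow {

    // Replace embedded tabs and line breaks with a single space; treat null as empty.
    static String clean(String field) {
        return field == null ? "" : field.replaceAll("[\\t\\r\\n]+", " ").trim();
    }

    // Join the cleaned fields with tab characters.
    public static String of(String... fields) {
        return Arrays.stream(fields)
                .map(TsvRow::clean)
                .collect(Collectors.joining("\t"));
    }
}
```

Inside the loop of the client, the string concatenation could then be replaced by `TsvRow.of(generatedQuestion.question(), generatedQuestion.shortAnswer(), generatedQuestion.templateName(), input)`.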
With the QuestionGenerationConsoleService, you should be able to check all the generated questions.
If you would like to debug and understand how a question was constructed, which templates or rules were used for its construction, or what the corresponding short answer is, set the "printLogs" variable to true:
private static boolean printLogs = true;
@inproceedings{dhole-manning-2020-syn,
title = "Syn-{QG}: Syntactic and Shallow Semantic Rules for Question Generation",
author = "Dhole, Kaustubh and
Manning, Christopher D.",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.acl-main.69",
pages = "752--765",
}