
WH Question Analysis

whAnalysis is a project that tags sentences with a clauseType and a questType so that the frequencies of those tags can be analyzed.

The linguistic work which brought this project to life was presented at XPRAG on June 13th, 2019.

See more about our motivations behind this project here



To use our tagger, we recommend installing it with git clone:

git clone


To run the tagger, use the command below. It creates a new file containing the tagged JSON in your current directory.

python relative/dir/to/data.json
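
Once the tagged file exists, counting tag frequencies could look like the sketch below. It assumes, without confirmation from the project, that the output is a JSON list of objects and that each object carries "clauseType" and "questType" keys with string values. The function names tag_frequencies and load_and_count are hypothetical and not part of whAnalysis.

```python
import json
from collections import Counter


def tag_frequencies(records, keys=("clauseType", "questType")):
    """Count how often each value occurs for each tag key.

    Assumes every tag value is a string (or otherwise hashable).
    """
    counts = {key: Counter() for key in keys}
    for record in records:
        for key in keys:
            if key in record:  # skip records that lack this tag
                counts[key][record[key]] += 1
    return counts


def load_and_count(path):
    """Load a tagged JSON file (assumed layout) and count its tags."""
    with open(path, encoding="utf-8") as f:
        return tag_frequencies(json.load(f))
```

Calling load_and_count on the tagged file would return one Counter per tag key, and the most_common() method of each Counter lists the values by frequency. If the real output uses a different layout, the key names and the loading step need to be adjusted.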

Data format

We've included a few simple functions for converting corpora to our data format in the corpus_handlers/ directory. The best example is our bnc handler.

The data must be in a .json file containing a single JSON list. Each sentence gets its own JSON object in that list, and each object must have a key "sentence" whose value is the sentence text. The tagger will not work unless the objects are wrapped in a list.

        [
            {"sentence": "Why did I need to include this json as a part of my readme?"},
            {"sentence": "It's helpful to have examples to follow!"}
        ]
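
As a minimal sketch of producing and checking a file in this format, assuming your sentences are already available as Python strings, the standard json module is enough. The helper names write_sentences and validate are hypothetical and are not part of whAnalysis or its corpus_handlers/ directory.

```python
import json


def write_sentences(sentences, path):
    """Write sentences as a JSON list of {"sentence": ...} objects."""
    records = [{"sentence": s} for s in sentences]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=4)


def validate(path):
    """Return the parsed records, or raise ValueError if the shape is wrong."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("top-level JSON value must be a list")
    for i, item in enumerate(data):
        # Each entry must be an object with a string under "sentence".
        if not isinstance(item, dict) or not isinstance(item.get("sentence"), str):
            raise ValueError(
                "item {} must be an object with a string 'sentence' key".format(i)
            )
    return data
```

For example, write_sentences(["It's helpful to have examples to follow!"], "data.json") writes a one-item list, and validate("data.json") returns it unchanged. Running the check before tagging catches the most common mistake, a bare object that is not wrapped in a list.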


This project is designed and developed entirely in Python, using:

  • Python - Python Version 3.5 or greater
  • NLTK - The Natural Language Toolkit
  • BeautifulSoup4 - Python scraping and parse tree processing library
  • JSON - Python json library
  • Multiprocessing - Python multiprocessing library

Authors and acknowledgment


GNU General Public License v3.0

See COPYING for the full text
