
WH Question Analysis

whAnalysis is a project for tagging sentences with clauseType and questType labels so that their frequencies can be analyzed.

The linguistic work which brought this project to life was presented at XPRAG on June 13th, 2019.

See more about our motivations behind this project here

Usage

Installation

To use our tagger, we recommend cloning the repository with git clone:

git clone https://github.com/rangat/whAnalysis.git

Run

To run our tagger, use the command below. It creates a new file in your current directory containing the tagged JSON.

python tagger.py relative/dir/to/data.json

Data format

We've included a few simple functions for converting corpora to our data format in the corpus_handlers/ directory. The best example is our BNC handler.

The data must be in a .json file whose top level is a list of JSON objects, one object per sentence. Each object must include a "sentence" key whose value is the sentence text. The tagger only works if the objects are wrapped in a list.

[
    {
        "sentence": "Why did I need to include this json as a part of my readme?"
    },
    {
        "sentence": "It's helpful to have examples to follow!"
    }
]
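
As a minimal sketch, assuming you start from a plain list of sentence strings, a file in this format could be produced with the standard json module alone. The names sentences_to_records, write_input_file and is_valid_input are hypothetical helpers written for this example and are not part of the project; the converters that ship with the project live in corpus_handlers/.

```python
import json


def sentences_to_records(sentences):
    """Wrap each non-empty sentence in a {"sentence": ...} object."""
    records = []
    for text in sentences:
        text = text.strip()
        if text:  # skip blank or whitespace-only entries
            records.append({"sentence": text})
    return records


def write_input_file(sentences, path):
    """Write the sentences to `path` as a JSON list of objects."""
    records = sentences_to_records(sentences)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=4)
    return records


def is_valid_input(data):
    """Check that `data` is a list of objects, each with a string "sentence"."""
    return isinstance(data, list) and all(
        isinstance(item, dict) and isinstance(item.get("sentence"), str)
        for item in data
    )
```

Calling write_input_file(["Why did I need this?"], "data.json") would write a one-element list to data.json, which can then be passed to tagger.py as shown under Run. The is_valid_input check only covers the structure described above (a list of objects with a "sentence" string); whether the tagger enforces anything further is not something this sketch assumes.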

Dependencies

This project is written entirely in Python and uses:

  • Python - Python Version 3.5 or greater
  • NLTK - The Natural Language Toolkit
  • BeautifulSoup4 - Python scraping and parse tree processing library
  • json - Python standard library JSON module
  • multiprocessing - Python standard library multiprocessing module
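
The list above can be checked before running the tagger. The following is a rough sketch, not a script from this repository: missing_modules and python_is_supported are hypothetical helper names, and it relies on NLTK being imported as nltk and BeautifulSoup4 as bs4.

```python
import importlib.util
import sys

# Import names for the dependencies listed above. NLTK is imported as
# "nltk" and BeautifulSoup4 as "bs4"; json and multiprocessing ship
# with Python itself.
REQUIRED_MODULES = ("nltk", "bs4", "json", "multiprocessing")
MIN_PYTHON = (3, 5)


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.5 or greater is required.")
    missing = missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that the modules can be located; it does not verify package versions or any NLTK data downloads the tagger may need.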

Authors and acknowledgment

License

GNU General Public License v3.0

See COPYING for the full text
