GitHub - gabehesse/question-classification: question classification for the dataset - [ http://cogcomp.org/Data/QA/QC/ ]. Understanding and categorising the questions is very useful while building a chat bot or a QA bot.

question-classification

Classifier for the question classification dataset - [ http://cogcomp.org/Data/QA/QC/ ]

Results from the empirical tests carried out are in {project_directory}/documentation/Results.md

Execution

Go to the project directory.
Run the command ./bin/qc.sh nlp first.
Once this Natural Language Processing (NLP) step has computed the annotated natural language properties, you can train one of the models.
To train a model, run the command ./bin/qc.sh train {ml_algo_model}, e.g. ./bin/qc.sh train svm
To test a model, run the command ./bin/qc.sh test {ml_algo_model}.
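
The steps above can also be chained from a small script. The following is only a minimal sketch, assuming it is run from the project directory and that ./bin/qc.sh accepts exactly the arguments shown above; the helper names `build_commands` and `run_pipeline` are hypothetical and not part of the project.

```python
import subprocess

# Path taken from the steps above, relative to the project directory.
QC_SCRIPT = "./bin/qc.sh"


def build_commands(model, run_nlp=True):
    """Return the qc.sh invocations in the order described above."""
    commands = []
    if run_nlp:
        # The nlp step has to come before any training.
        commands.append([QC_SCRIPT, "nlp"])
    commands.append([QC_SCRIPT, "train", model])
    commands.append([QC_SCRIPT, "test", model])
    return commands


def run_pipeline(model, run_nlp=True):
    """Run each step in turn and stop at the first one that fails."""
    for command in build_commands(model, run_nlp):
        subprocess.run(command, check=True)


# Show the commands that would be executed for the svm model.
for command in build_commands("svm"):
    print(" ".join(command))
```

Calling `run_pipeline("svm")` would execute the three commands; passing `run_nlp=False` skips the first one for a later training run.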

Machine learning algorithms implemented - {ml_algo_model}

svm = Support Vector Machine
lr = Logistic Regression
linear_svm = Linear Support Vector Classifier (Machine)
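
As an illustration of how these names might map onto estimators, here is a minimal sketch. It assumes the three models correspond to scikit-learn's `SVC`, `LogisticRegression`, and `LinearSVC` classes (scikit-learn is listed under the dependencies, but the exact classes and parameters the project uses are not stated here); `MODEL_REGISTRY` and `make_model` are hypothetical names.

```python
import importlib

# Assumed mapping from {ml_algo_model} to (module, class) in scikit-learn.
MODEL_REGISTRY = {
    "svm": ("sklearn.svm", "SVC"),
    "lr": ("sklearn.linear_model", "LogisticRegression"),
    "linear_svm": ("sklearn.svm", "LinearSVC"),
}


def make_model(name, **params):
    """Create an untrained estimator for the given model name."""
    if name not in MODEL_REGISTRY:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError("unknown model %r, expected one of: %s" % (name, known))
    module_path, class_name = MODEL_REGISTRY[name]
    # Import lazily so that only the requested estimator's module is loaded.
    module = importlib.import_module(module_path)
    return getattr(module, class_name)(**params)
```

For example, `make_model("lr", C=1.0)` would return a `LogisticRegression` instance when scikit-learn is installed.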

Experimental Code

The method that converts text data to ML features can be modified in the function qc.dataprep.text_features.get_vect.
The feature stack (the data that is fed to the ML algorithm) can be modified, transformed, or generated in the file qc.dataprep.feature_stack.

Changes to these two points (1 and 2) take effect the next time you run the training process. There is no need to run the nlp step again.
Machine learning algorithms can be added in the function qc.ml.train.train_one_node (parameter tuning can be done there too), e.g. by adding an extra elif statement in the experimental part of the code:
```
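# {model_variable} stands for whichever variable the existing if/elif branches compare
elif {model_variable} == "{your_model_name}":
    machine = {Initialize the algorithm you want to use}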
```
When running through the shell script, execute the command ./bin/qc.sh train {your_model_name}; this command will use the model you defined.
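
To make the first two points of this section concrete, the sketch below shows one simple way text can be turned into feature vectors and several feature layers stacked into one row per question. It is not the project's `get_vect` or `feature_stack` code, which is not shown here: the bag-of-words approach, the function names, and the sample data are all assumptions chosen for illustration.

```python
def build_vocab(docs):
    """Assign a column index to every distinct token, in first-seen order."""
    vocab = {}
    for tokens in docs:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorize(tokens, vocab):
    """Turn one token list into a count vector; unseen tokens are ignored."""
    row = [0] * len(vocab)
    for token in tokens:
        if token in vocab:
            row[vocab[token]] += 1
    return row


def stack_features(*feature_rows):
    """Concatenate the per-layer vectors of one question into a single row."""
    stacked = []
    for row in feature_rows:
        stacked.extend(row)
    return stacked


# Two hypothetical annotation layers for the same questions: words and POS tags.
words = [["what", "is", "spacy"], ["who", "wrote", "hamlet"]]
tags = [["PRON", "AUX", "PROPN"], ["PRON", "VERB", "PROPN"]]

word_vocab = build_vocab(words)
tag_vocab = build_vocab(tags)

# One stacked row per question: word counts followed by tag counts.
rows = [
    stack_features(vectorize(w, word_vocab), vectorize(t, tag_vocab))
    for w, t in zip(words, tags)
]
```

Swapping the vectorizer or adding another layer changes only the feature rows, which is consistent with the note above that retraining does not require running the nlp step again.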

Dependencies used

python - v3.6.3
configobj - v5.0.6
spaCy - v2.0.9 (with the "en_core_web_lg" English model)
sner - v0.2.3
scipy - v1.0.0
scikit-learn - v0.19.1
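
Before running the scripts, it can help to confirm that the interpreter can see these packages. The following is a minimal sketch: the import names in `REQUIRED_MODULES` are assumptions (for example, scikit-learn is imported as `sklearn`), the helper names are hypothetical, and the check covers only presence, not the exact versions or the spaCy language model.

```python
import importlib.util
import sys

# Import names assumed for the packages listed above.
REQUIRED_MODULES = ["configobj", "spacy", "sner", "scipy", "sklearn"]


def missing_modules(names):
    """Return the module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment(min_version=(3, 6)):
    """Report whether the interpreter is new enough and which modules are absent."""
    return {
        "python_ok": sys.version_info >= min_version,
        "missing": missing_modules(REQUIRED_MODULES),
    }


print(check_environment())
```

An empty `missing` list together with `python_ok` set to `True` suggests the interpreter is a reasonable candidate for running the project.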

Credits

This project was inspired by one of the problems we tried to solve: understanding the question for our QA bot. On that project I worked with Akash Pateria - [https://github.com/Akash-Pateria]; we worked together on it as our final year graduate project, named Invoker.

This project aims to explore more options for processing natural language (English) and to improve the accuracy.

NOTE:

1. Tab = 4 spaces

2. The command `python` should point to an installation that satisfies the dependencies listed above.

3. Alternatively, change the command in the shell script `qc.sh` to the suitable python command, e.g.

python -m {operation} -> python3 -m {operation}
