LanguagePredictionUsingDecisionTree

How to run this program

Required Installation: The program uses a few auxiliary libraries: NLTK, numpy, and pandas. Install these packages with pip or conda before running the program. The csv module is also used, but it is part of the Python standard library and needs no installation.

Training Process

To train this program, you need to include 3 parameters:

examples - the training example file.
hypothesisOut - the file to write the model to; you can choose any name
learning-type - which learning algorithm to train with: either “dt” (decision tree) or “ada” (AdaBoost)

Usage: python train.py <examples> <hypothesisOut> <learning-type>
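The three positional parameters above could be parsed roughly as follows. This is only a sketch: the README does not say how train.py reads its arguments, so the use of argparse and the name build_parser are assumptions made for illustration.

```python
import argparse


def build_parser():
    # Hypothetical parser: the real train.py may read sys.argv directly.
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("examples", help="path to the training example file")
    parser.add_argument("hypothesisOut", help="file to write the trained model to")
    parser.add_argument(
        "learning_type",
        choices=["dt", "ada"],
        help="dt = decision tree, ada = AdaBoost",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(
    ["processed_data/train.txt", "decisiontreeClassifier.py", "dt"]
)
print(args.examples, args.hypothesisOut, args.learning_type)
```

Restricting learning_type with choices makes a mistyped algorithm name fail immediately with a usage message instead of partway through training.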

Training files are available in the processed_data directory, which contains train.txt and test.txt:

train.txt - For training
test.txt - For validation, to check the accuracy the classifier achieves on held-out data
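Accuracy on the validation file is simply the fraction of examples labelled correctly. A minimal sketch, assuming predictions and true labels are available as two parallel lists; the function name accuracy and the labels "en" and "nl" are illustrative and not taken from the repository:

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels: "en" for English, "nl" for Dutch.
predicted = ["en", "nl", "en", "nl"]
actual = ["en", "nl", "nl", "nl"]
print(accuracy(predicted, actual))  # 0.75
```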

Warning: AdaBoost takes a while to train because it builds a new decision stump on every iteration. Training 10 stumps may take up to a minute.
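To see where the time goes, here is a generic AdaBoost loop over decision stumps. It is a textbook-style sketch and not the code in adaBoostTrain.py: it assumes boolean features and labels encoded as +1 and -1, and every function name in it is hypothetical. Each round scans every feature against every example, which is why training cost grows with the number of stumps.

```python
import math


def stump_predict(stump, x):
    # A stump tests one boolean feature: predict `sign` if it is set, else -sign.
    feature, sign = stump
    return sign if x[feature] else -sign


def best_stump(X, y, w):
    # Try every (feature, sign) pair and keep the lowest weighted error.
    best = None
    for feature in range(len(X[0])):
        for sign in (1, -1):
            stump = (feature, sign)
            err = sum(
                wi for xi, yi, wi in zip(X, y, w) if stump_predict(stump, xi) != yi
            )
            if best is None or err < best[0]:
                best = (err, stump)
    return best


def adaboost(X, y, rounds=10):
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, stump = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified examples, lower the rest.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(stump, xi))
            for xi, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    # Weighted vote of all stumps.
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data: feature 0 separates the two classes perfectly.
X = [[True, False], [True, True], [False, True], [False, False]]
y = [1, 1, -1, -1]
model = adaboost(X, y, rounds=3)
print([predict(model, x) for x in X])
```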

Information: For the decision tree, training generates a .py file when it completes. The file encodes the classifier as a series of Python if statements.
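One way such a file could be produced is to walk the learned tree and emit an if/else block per node. The sketch below is an assumption about the general approach, not the logic in decisionTreeTrain.py: the tuple-based tree layout, the feature names has_de and has_the, and the functions emit and tree_to_source are all made up for illustration.

```python
def emit(node, indent=1):
    """Return source lines for a node; a leaf is a label string,
    an inner node is a (feature, yes_branch, no_branch) tuple."""
    pad = "    " * indent
    if isinstance(node, str):
        return [f"{pad}return {node!r}"]
    feature, yes_branch, no_branch = node
    lines = [f"{pad}if features[{feature!r}]:"]
    lines += emit(yes_branch, indent + 1)
    lines.append(f"{pad}else:")
    lines += emit(no_branch, indent + 1)
    return lines


def tree_to_source(tree, func_name="classify"):
    # Wrap the generated if statements in a function definition.
    return "\n".join([f"def {func_name}(features):"] + emit(tree)) + "\n"


# Hypothetical tree: test for the word "de" first, then for "the".
tree = ("has_de", "nl", ("has_the", "en", "nl"))
source = tree_to_source(tree)
print(source)

# The generated text is valid Python and can be executed or written to a .py file.
namespace = {}
exec(source, namespace)
print(namespace["classify"]({"has_de": False, "has_the": True}))  # en
```

Writing the tree out as plain source means prediction needs no model-loading code: the generated file can simply be imported and called.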

Example Run:

Training using AdaBoost: python train.py processed_data/train.txt adaboostclassifier.txt ada

Training using Decision Tree: python train.py processed_data/train.txt decisiontreeClassifier.py dt

Prediction

To run a prediction, you need to include 2 parameters:

hypothesis - the best classifier generated by the decision tree or AdaBoost
file - the test file

The program prints to the console whether it predicted each input as English or Dutch.

Usage: python predict.py <hypothesis> <file>

Example Run: python predict.py dt processed_data/test1_prof.dat

JLiu1272/LanguagePredictionUsingDecisionTree
