Skip to content

This repository contains all the code written for the final project of CSCI-GA.2590-001 Natural Language Processing 15Spring

Notifications You must be signed in to change notification settings

amallia/Automated-Essay-Grader

 
 

Repository files navigation

Automated-Essay-Grader

This repository contains all the code written for the final project of CSCI-GA.2590-001 Natural Language Processing 15Spring

Team members:

  • Yucheng Lu
  • Fangyun Sun
  • Wenying Liu

Our project is based on a kaggle competition here: https://www.kaggle.com/c/asap-aes

Run:

  • First, you need to install all the packages in the requirements.txt.
  • Then, use python run.py to run the whole program. And the program will first generate all the necessary features and store them into FeatureData directory and then add these features into training data and test data. And the program will store the training data into TrainingData directory and store test data into TestData directory. The preocess will take about 4-5 hours.
  • Finally, the program will train linear regression model, gradient boosting tree model and random forest regression model and store the results in the Result directory.

Reminder:

  • For the linear regression does not perform well in a high-dimensional data set, so we just use the basic features of the articles and the part of speech features. So we store the generated data in advance in the LinearRegressionData directory.

About

This repository contains all the code written for the final project of CSCI-GA.2590-001 Natural Language Processing 15Spring

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%