This repository contains our code for the course project of ME781 (Engineering Data Mining and Applications), conducted at IIT Bombay.
pip install numpy pandas scikit-learn tqdm nltk jupyter
Download the dataset from Kaggle and place the extracted CSV in a data/ directory at the root of the repository.
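As a quick sanity check before running the notebook, the sketch below lists the CSV files found in data/. The helper name list_dataset_csvs is hypothetical and not part of this repository; it only assumes that the extracted file has a .csv extension and sits directly under data/.

```python
from pathlib import Path


def list_dataset_csvs(data_dir="data"):
    """Return the CSV files directly under data_dir (hypothetical helper).

    Raises FileNotFoundError if the directory is missing or holds no CSV,
    which usually means the Kaggle download has not been extracted yet.
    """
    data_path = Path(data_dir)
    if not data_path.is_dir():
        raise FileNotFoundError(f"Expected a directory at {data_path}")
    csv_files = sorted(data_path.glob("*.csv"))
    if not csv_files:
        raise FileNotFoundError(f"No CSV files found in {data_path}")
    return csv_files
```

Calling list_dataset_csvs() from the repository root should return at least the extracted dataset file; after preprocessing, any splits saved as CSV would appear in the same list.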
Run the EDA_and_Preprocessing.ipynb notebook to view some EDA results and to preprocess the data. The preprocessed train and test splits of the data are also saved in the data/ directory.
python train_data_gen.py
python train.py --model [logistic/SVM/MLP]
python eval.py --model [logistic/SVM/MLP] --testsize 100
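For readers unfamiliar with this kind of command-line interface, here is a minimal sketch of how the --model and --testsize flags could be parsed and mapped to scikit-learn estimators. It is not the actual code in train.py or eval.py: the names build_parser and make_model, the default test size, and the estimator hyperparameters are all assumptions made for illustration.

```python
import argparse

# The model names accepted on the command line, as listed in the usage above.
MODEL_CHOICES = ("logistic", "SVM", "MLP")


def build_parser():
    """Build a parser for the flags shown above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Train or evaluate a classifier.")
    parser.add_argument("--model", choices=MODEL_CHOICES, required=True)
    # Assumed meaning: number of test samples to evaluate on.
    parser.add_argument("--testsize", type=int, default=100)
    return parser


def make_model(name):
    """Map a --model value to a scikit-learn estimator (assumed mapping)."""
    # Imported lazily so that argument parsing works without scikit-learn.
    from sklearn.linear_model import LogisticRegression
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    if name == "logistic":
        return LogisticRegression(max_iter=1000)
    if name == "SVM":
        return SVC()
    if name == "MLP":
        return MLPClassifier(max_iter=300)
    raise ValueError(f"Unknown model: {name}")


# Example: parse the same arguments as the eval command above.
args = build_parser().parse_args(["--model", "SVM", "--testsize", "100"])
```

Restricting --model with choices makes argparse reject a misspelled model name before any data is loaded.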
Note: Further details about the implementation and evaluation can be found in the report and the code documentation.