Sentiment Analysis Project

For the datasets, please contact the author...

Test Description

This is a classic NLP task: text classification for sentiment analysis. It involves three steps:

Process the raw data to extract features.
Build one or more models (e.g. Logistic Regression, Decision Tree, Neural Network) and train them on the data set.
Evaluate the trained models with suitable metrics (e.g. Precision, Recall, AUC).
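The three steps can be sketched end to end in plain Python. This is a toy illustration, not the repository's code: the function names, the tiny data set, and the hand-written gradient descent are all assumptions made for the example.

```python
import math
from collections import Counter


def extract_features(text):
    """Step 1: turn raw text into bag-of-words counts."""
    return Counter(text.lower().split())


def predict_proba(weights, bias, feats):
    """Logistic regression score: sigmoid of the weighted token counts."""
    z = bias + sum(weights.get(tok, 0.0) * n for tok, n in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=50, lr=0.5):
    """Step 2: fit logistic regression with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, y in zip(samples, labels):
            err = predict_proba(weights, bias, feats) - y
            bias -= lr * err
            for tok, n in feats.items():
                weights[tok] = weights.get(tok, 0.0) - lr * err * n
    return weights, bias


def precision_recall(y_true, y_pred):
    """Step 3: precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data, invented for illustration only.
train_texts = ["great movie", "loved it", "terrible plot", "hated it"]
train_labels = [1, 1, 0, 0]
weights, bias = train([extract_features(t) for t in train_texts], train_labels)

test_texts = ["loved great", "hated terrible"]
test_labels = [1, 0]
predictions = [
    int(predict_proba(weights, bias, extract_features(t)) >= 0.5)
    for t in test_texts
]
precision, recall = precision_recall(test_labels, predictions)
print(f"precision={precision:.2f} recall={recall:.2f}")
```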

Development Environment

The main tools and libraries are listed in requirements.txt.

Project Structure

This is a detailed description of the organizational structure of the project.

1. organization structure

directory	description
data	data files and glove.840B.300d.txt
ipynb	notebook code
model	saved deep learning model files
model_best	best deep learning model files
predict_result	prediction results for test.json
images	figures from the data analysis
src	main code

2. code structure

src, core code	description
src/config.py	project configuration module, mainly the paths for reading and storing files
src/constant.py	constants and rarely changed values
src/util.py	data processing module, mainly functions for reading and processing data
src/model.py	deep learning model definition
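As a rough idea of what a path configuration module could contain, here is a minimal sketch built from the directory names listed in this README. The function name `build_paths`, the dictionary keys, and the location of test.json are hypothetical; the actual src/config.py may be organized differently.

```python
from pathlib import Path


def build_paths(project_root):
    """Return the file and directory paths used by training and prediction."""
    root = Path(project_root)
    data_dir = root / "data"
    return {
        "data_dir": data_dir,
        "glove_file": data_dir / "glove.840B.300d.txt",
        "test_file": data_dir / "test.json",  # assumption: test.json lives in data/
        "model_dir": root / "model",
        "best_model_dir": root / "model_best",
        "predict_dir": root / "predict_result",
    }


# Placeholder root; replace with the local checkout path.
paths = build_paths("/path/to/Sentiment-Analysis-Project")
print(paths["glove_file"])
```

Keeping every path in one place means the training scripts only need the project root to change when the data moves.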

3. deep learning model

src, main train and predict code	description
src/main_train_dl_...	model training modules; each covers data processing, feature extraction, model training and model validation
src/main_train_dl_1_rnn_simple_and_predict.py	simple RNN without GloVe embedding
src/main_train_dl_2_rnn_glove_embedding_and_predict.py	RNN with GloVe embedding
src/main_train_dl_3_cnn_and_predict.py	CNN with GloVe embedding
src/main_train_dl_4_rcnn_glove_embedding_and_predict.py	RCNN with GloVe embedding
src/main_train_dl_5_elmo_like_and_predict.py	ELMo-like model with GloVe embedding
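Most of these models start from the pre-trained GloVe vectors in glove.840B.300d.txt. The sketch below shows one way to parse such a file into a dictionary; `load_glove` is a hypothetical helper, not a function from this repository. It splits each line from the right because a few tokens in the 840B file are reported to contain spaces, which breaks a plain left-to-right split.

```python
def load_glove(lines, dim=300):
    """Parse GloVe text lines ("token v1 v2 ... v_dim") into {token: vector}."""
    vectors = {}
    for line in lines:
        # Split from the right so a token containing spaces stays in one piece.
        parts = line.rstrip("\n").rsplit(" ", dim)
        if len(parts) != dim + 1:
            continue  # skip malformed or truncated lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# Tiny 3-dimensional sample in the same layout, invented for illustration.
sample = [
    "good 0.1 0.2 0.3\n",
    ". . 0.4 0.5 0.6\n",   # token that itself contains a space
    "bad 1.0\n",           # too few values, skipped
]
embeddings = load_glove(sample, dim=3)
print(sorted(embeddings))
```

With the real file, the same function can be called on an open file handle, for example `load_glove(open(path, encoding="utf-8"))`.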

4. machine learning model

src, main train and predict code	description
src/main_train_ml_...	model training modules; each covers data processing, feature extraction, model training and model validation
src/main_train_ml_1_lr_word_and_predict.py	logistic regression + word ngram
src/main_train_ml_2_lr_char_and_predict.py	logistic regression + char ngram
src/main_train_ml_3_lr_word_char_and_predict.py	logistic regression + word and char ngram
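To make the difference between the three feature sets concrete, here is a small plain-Python sketch of word n-grams, character n-grams, and their combination. The function names, the n-gram ranges, and the `w:` / `c:` prefixes are assumptions for illustration; the repository scripts may rely on a library vectorizer with different settings.

```python
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Count word n-grams, e.g. unigrams and bigrams."""
    tokens = text.lower().split()
    return Counter(
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    )


def char_ngrams(text, n_min=2, n_max=3):
    """Count character n-grams, which tolerate typos and unseen words."""
    chars = text.lower()
    return Counter(
        chars[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(chars) - n + 1)
    )


def word_char_ngrams(text):
    """Combine both feature sets; prefixes keep the two namespaces apart."""
    combined = Counter({f"w:{k}": v for k, v in word_ngrams(text).items()})
    combined.update({f"c:{k}": v for k, v in char_ngrams(text).items()})
    return combined


print(word_ngrams("not good"))
print(char_ngrams("good"))
```

The bigram "not good" shows why word n-grams matter for sentiment: the unigram "good" alone would point the wrong way.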

Instructions

Prepare a pyenv environment and run pip install -r requirements.txt
Set the data file storage paths in src/config.py
Run the script run.sh. The trained model is saved, and the precision_score, recall_score and f1_score on the test set are written to the log.

Tips:

sh run.sh runs only the best model:

python src/main_train_dl_4_rcnn_glove_embedding_and_predict.py -ep 10

Metrics for the RCNN model:

=====test run result=====:

test f1_score: 0.8123589611797799
test precision_score: 0.8125411611403617
test recall_score: 0.8123827392120075
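The reported F1 is slightly lower than both precision and recall. For a single class, F1 is the harmonic mean of precision and recall and so always lies between them, which suggests the three scores are averaged over classes. The averaging mode is not stated here, so the macro averaging in this sketch is an assumption, and `macro_scores` is a hypothetical helper.

```python
def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, f1) over all labels."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Invented labels: class 1 has P=1.0, R=0.5; class 0 has P=0.5, R=1.0.
macro_p, macro_r, macro_f1 = macro_scores([1, 1, 0], [1, 0, 0])
print(macro_p, macro_r, macro_f1)
```

In this toy case the averaged F1 falls below both the averaged precision and the averaged recall, the same pattern as in the log above.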

Train and predict with the other machine learning models:

python src/main_train_ml_1_lr_word_and_predict.py
python src/main_train_ml_2_lr_char_and_predict.py
python src/main_train_ml_3_lr_word_char_and_predict.py

Train and predict with the other deep learning models:

Before training one of the following models, you may need to comment out the prediction and metrics code.
After training completes, uncomment the prediction and metrics code and load the best model you produced.
To run prediction and evaluation only, comment out the training code or set the parameter -ep 0.

python src/main_train_dl_1_rnn_simple_and_predict.py -ep 10
python src/main_train_dl_2_rnn_glove_embedding_and_predict.py -ep 10
python src/main_train_dl_3_cnn_and_predict.py -ep 10
python src/main_train_dl_5_elmo_like_and_predict.py -ep 10
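The -ep 0 behaviour described in the notes above can be sketched with argparse. Only the -ep flag comes from the commands in this README; the long form --epochs, the default of 10, and the step names are assumptions, and the real scripts may structure this differently.

```python
import argparse


def run(argv):
    """Parse the epoch count and return the steps that would be executed."""
    parser = argparse.ArgumentParser(description="Train and/or evaluate a model.")
    # "-ep" is the flag used in the commands above; "--epochs" is an assumed long form.
    parser.add_argument("-ep", "--epochs", type=int, default=10,
                        help="number of training epochs; 0 skips training")
    args = parser.parse_args(argv)

    steps = []
    if args.epochs > 0:
        # Placeholder for the real training loop.
        steps.append(f"train for {args.epochs} epochs")
    steps.append("load best model")
    steps.append("predict and evaluate")
    return steps


print(run(["-ep", "10"]))
print(run(["-ep", "0"]))
```

Guarding the training step on the epoch count would avoid having to comment code in and out between training and evaluation runs.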

Submit Requirement

The output should include:

prediction results for test.json
evaluation results for training and prediction performance

blaire101/Sentiment-Analysis-Project
