A code repository for the research paper "A Review of Natural Language Processing in Contact Centre Automation"
SShah30-hue/sentiment-analysis-review
Sentiment analysis neural networks trained on the MELD dataset: RoBERTa, BERT, ALBERT, or DistilBERT fine-tuned for text, and a 2D CNN or wav2vec 2.0 fine-tuned for audio.

Requirements:
torch==1.3.0
pandas==0.25.0
numpy==1.17.4
transformers==3.0.1

To download the data, please visit https://affective-meld.github.io/

Note: to train or evaluate the audio models, the downloaded data first needs to be converted from MP4 to mono WAV format.

----------------------------------------------------------------------------------------------------------------------
For TEXT input:

TO TRAIN THE MODEL:
python train.py --data_format text --model_name_or_path roberta-base --output_dir my_model --num_eps 2

TO EVALUATE THE MODEL YOU HAVE TRAINED:
python evaluate.py --data_format text --model_name_or_path models/my_model_text

TO ANALYZE INPUTS WITH THE MODEL YOU HAVE TRAINED:
python analyze.py --model_name_or_path models/my_model_text

----------------------------------------------------------------------------------------------------------------------
For AUDIO input:

2D CNN

TO TRAIN THE MODEL:
python train.py --data_format audio_2dcnn --feature mfcc --train_size 800 --test_size 200 --num_eps 20
OR
python train.py --data_format audio_2dcnn --feature mfcc --train_size 22000 --test_size 2131 --num_eps 20

(Please note: normally array_cols=641; if not, the error message will display the correct *array_cols* to input. You can choose between *mfcc* and *melspec* for feature extraction.)

TO EVALUATE THE MODEL YOU HAVE TRAINED:
python evaluate.py --data_format audio_2dcnn --feature mfcc --train_size 800 --test_size 200

WAV2VEC2

TO TRAIN THE MODEL:
python train.py --data_format wav2vec2 --train_size 800 --test_size 200

TO TRAIN WAV2VEC2 LARGE:
python train.py --data_format wav2vec2 --train_size 800 --test_size 200 --model_name_or_path facebook/wav2vec2-large-960h

TO EVALUATE THE MODEL YOU HAVE TRAINED:
python evaluate.py --data_format wav2vec2 --test_size 200 --audio_model checkpoint-xxx

-------------------------------------------------------------------------------------------------------------------------
For BIMODAL input:

First run all cells in the Jupyter notebook "RoBERTa transfotmer embeddings.ipynb" and in either "wav2vec 2.0 base embeddings.ipynb" or "wav2vec 2.0 large embeddings.ipynb"; each notebook saves its embeddings to disk. Then run "bimodal.ipynb" to fuse the saved embeddings and to train and evaluate the model.
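The README does not say which tool to use for the MP4-to-WAV conversion mentioned above. One common approach is to call ffmpeg from a short Python script; this is only a sketch under assumptions: it requires ffmpeg on your PATH, the "train_splits" directory name and the 16 kHz sample rate are illustrative choices (16 kHz is the rate wav2vec 2.0 models are pretrained on), and you should adjust both to match your data layout and pipeline.

```python
import pathlib
import subprocess

# Convert every MELD clip from MP4 to mono WAV.
# Assumes ffmpeg is installed and the clips live in "train_splits"
# (a hypothetical directory name -- adjust to your layout).
for mp4 in pathlib.Path("train_splits").glob("*.mp4"):
    wav = mp4.with_suffix(".wav")
    # -vn drops the video stream, -ac 1 downmixes to mono,
    # -ar 16000 resamples to 16 kHz.
    subprocess.run(
        ["ffmpeg", "-i", str(mp4), "-vn", "-ac", "1", "-ar", "16000", str(wav)],
        check=True,
    )
```

Repeat for the dev and test splits before running the audio training or evaluation commands.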
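For intuition about the bimodal step, here is a minimal sketch of feature-level fusion by concatenation. Whether bimodal.ipynb fuses exactly this way is an assumption, and the array names and shapes below are stand-ins (in the notebooks the embeddings would be loaded from the files the earlier notebooks saved).

```python
import numpy as np

# Stand-ins for the saved per-utterance embeddings; shapes are illustrative:
# 4 utterances, 768-dimensional RoBERTa and wav2vec 2.0 vectors.
text_emb = np.random.rand(4, 768)
audio_emb = np.random.rand(4, 768)

# Feature-level (early) fusion: concatenate the two modalities per utterance,
# giving one joint vector per utterance for the downstream classifier.
fused = np.concatenate([text_emb, audio_emb], axis=1)
print(fused.shape)  # (4, 1536)
```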