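Neural-Network-Based Text Classification

🌐 Introduction

A simple research-training project on text classification with neural networks. Why not CV? Simply because I am more interested in NLP ~~(not really)~~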

📥 Installation

git clone https://github.com/Lan-ce-lot/pythorch-text-classification.git

🛠 Usage

# conda (recommended) to create a new conda env
conda env create -f environment.yaml
# or
conda install --yes --file requirements.txt
# pip
pip install -r requirements.txt

python run.py --model bert
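
As a rough illustration of how a `--model` flag like the one above can be handled, here is a minimal sketch using `argparse`. It is not the project's actual `run.py`; the `SUPPORTED_MODELS` tuple and the `parse_args` helper are hypothetical names chosen for this example, and the real script may accept other values or extra flags.

```python
import argparse

# Hypothetical list of model names, taken from the models named in this README.
# The real run.py may accept a different set of values.
SUPPORTED_MODELS = ("bert", "ernie", "rnn", "cnn")


def parse_args(argv=None):
    """Parse command-line arguments; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Neural text classification")
    parser.add_argument(
        "--model",
        type=str,
        required=True,
        choices=SUPPORTED_MODELS,
        help="name of the model to train",
    )
    return parser.parse_args(argv)


# Equivalent to running: python run.py --model bert
args = parse_args(["--model", "bert"])
print(f"Selected model: {args.model}")  # prints "Selected model: bert"
```

Restricting `--model` with `choices` makes a mistyped model name fail early with a clear error instead of failing later during import or training.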

🌏 Environment

python 3.8

pytorch 1.3.1

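💾 Dataset

Scraped from Douban short reviews. Since its redesign, Douban has added many anti-scraping measures, and scraping too much gets your IP and account banned. Workarounds:

Proxy IPs (free ones do not work, paid ones are too expensive)

Random delay (>= 5 s) between requests + a random User-Agent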
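
The second workaround can be sketched roughly as follows, using only the Python standard library. This is an illustrative sketch, not the scraper used in this project: the `USER_AGENTS` list, the 10 s upper bound, and the helper names `random_headers`, `random_delay`, and `polite_get` are all assumptions made for the example.

```python
import random
import time
import urllib.request

# Example User-Agent strings (assumed for illustration; use any realistic pool).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def random_headers():
    """Return request headers with a randomly chosen User-Agent."""
    return {"User-Agent": random.choice(USER_AGENTS)}


def random_delay(min_seconds=5.0, max_seconds=10.0):
    """Pick a random wait time; the lower bound matches the >= 5 s rule."""
    return random.uniform(min_seconds, max_seconds)


def polite_get(url, timeout=10.0):
    """Sleep for a random interval, then fetch the URL with a random User-Agent."""
    time.sleep(random_delay())
    request = urllib.request.Request(url, headers=random_headers())
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `polite_get(url)` once per page keeps the request rate low and varies the client fingerprint, at the cost of a much slower crawl.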

🚙 Models

BERT(Bidirectional Encoder Representations from Transformers) ✅
ERNIE(Enhanced Representation through kNowledge IntEgration) ✅
RNN(Recurrent Neural Network) 🤡
CNN(Convolutional Neural Network) 🤡

📊 Results

TensorBoard is integrated, so you can launch it from the terminal to follow the training process:

tensorboard --logdir=./data/log/textRNN

Figure: training-set accuracy of BiLSTM vs. BERT. Figure: training-set loss of BiLSTM vs. BERT.

Model	Train loss	Train accuracy	Test loss	Test accuracy
BiLSTM	0.29	0.93	0.32	0.87
BERT	0.03	0.98	0.21	0.92

Model	Review class	Precision	Recall	F1-score	Number of reviews
BiLSTM	Positive	0.8899	0.9238	0.9065	3779
	Negative	0.8216	0.7543	0.7865	1758
BERT	Positive	0.9332	0.9619	0.9474	3779
	Negative	0.9123	0.8521	0.8812	1758

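📈 Progress

📦 Dependencies

Application

The GUI is written in Python with Qt (pythonQt). It has two buttons, Submit and Clear. Text is entered in the text box in the middle, and the sentiment analysis result, positive or negative, is displayed on the left. The layout is shown in the figure below.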
