GitHub - Stuyxr/Sentiment-Analysis: LSTM-based sentiment analysis for short Chinese texts

Chinese word vector model

https://github.com/Embedding/Chinese-Word-Vectors

sgns.zhihu.bigram
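A minimal sketch of reading pretrained vectors such as sgns.zhihu.bigram into memory. It assumes the file uses the word2vec text layout (a header line with vocabulary size and dimension, then one word and its space-separated values per line); that layout, the function name `load_word_vectors`, and the `max_words` parameter are assumptions for illustration and are not taken from this repository's code. The sample values below are invented.

```python
import io


def load_word_vectors(lines, max_words=None):
    """Parse word2vec-style text: 'count dim' header, then 'word v1 ... vdim'."""
    it = iter(lines)
    header = next(it).split()
    dim = int(header[1])  # second header field is the vector dimension
    vectors = {}
    for line in it:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip lines that do not match the expected layout
        vectors[parts[0]] = [float(x) for x in parts[1:]]
        if max_words is not None and len(vectors) >= max_words:
            break  # optional cap to limit memory use
    return dim, vectors


# Tiny in-memory sample standing in for the real (much larger) file.
sample = io.StringIO("2 3\n电影 0.1 0.2 0.3\n好看 -0.4 0.5 0.6\n")
dim, vectors = load_word_vectors(sample)
```

For the real file, the same function could be called on a handle opened with `open(path, encoding="utf-8")`, where `path` points to wherever the downloaded vectors were saved.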

Runtime environment

PyTorch 1.0.1

Training the model

python sentiment_analysis.py

Testing the model

python test.py

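Notes

The "train_data" directory provides 10,000 training examples: 5,000 positive-sentiment texts (sample.positive.txt) and 5,000 negative-sentiment texts (sample.negative.txt).

The files are UTF-8 encoded, and the data is stored in XML format, as follows: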

<review id="n">
xxx
</review>

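Each "review" tag is one training example; "id" is the example number (0 to 9999), and the tag content "xxx" is the text.

The "test_data" directory contains the file "test.txt", which holds 2,500 test examples of unknown class (positive or negative) to be predicted with the trained system. The file is UTF-8 encoded, and the data is stored in XML format, as follows: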

<review id="n">
xxx
</review>

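Each "review" tag is one test example; "id" is the test example number (0 to 2499), and the tag content "xxx" is the text. When predicting on the test data, use "1" for positive and "0" for negative.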
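The data format described above could be read along the following lines. This is a sketch, not the repository's load_data.py: the helper names `parse_reviews` and `label_reviews` are hypothetical, the sample texts and ids are invented, and a regular expression is used on the assumption that the files are a flat sequence of "review" tags with no enclosing root element. Labelling the training texts with 1 and 0 mirrors the prediction convention stated above.

```python
import re

# Matches one <review id="n"> ... </review> block; DOTALL lets the text span lines.
REVIEW_RE = re.compile(r'<review id="(\d+)">\s*(.*?)\s*</review>', re.DOTALL)


def parse_reviews(text):
    """Return a list of (id, review_text) pairs in file order."""
    return [(int(m.group(1)), m.group(2)) for m in REVIEW_RE.finditer(text)]


def label_reviews(positive_text, negative_text):
    """Build (review_text, label) pairs: 1 for positive, 0 for negative."""
    data = [(t, 1) for _, t in parse_reviews(positive_text)]
    data += [(t, 0) for _, t in parse_reviews(negative_text)]
    return data


# Invented samples standing in for sample.positive.txt and sample.negative.txt.
positive = '<review id="0">\n很好看\n</review>\n<review id="1">\n值得推荐\n</review>\n'
negative = '<review id="2">\n太差了\n</review>\n'

parsed = parse_reviews(positive)
dataset = label_reviews(positive, negative)
```

With the real data, the two strings would come from reading train_data/sample.positive.txt and train_data/sample.negative.txt with `encoding="utf-8"`; `parse_reviews` applies unchanged to test.txt, where the ids identify which prediction belongs to which review.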

Repository files:

__pycache__
test_data
train_data
.gitignore
1170300418.csv
README.md
cut.txt
label_data.pt
load_data.py
readme-情感分类系统说明.txt
sentiment_analysis.py
test.py
