NLP_Course_Project

News Text Classification and Recommendation System

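Word frequency statistics and word cloud visualization on the existing corpus (status: function finished)
- Tech stack: NLP library (NLTK), word cloud library (WordCloud), data visualization libraries (e.g. Matplotlib, Seaborn)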
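
The word-frequency step can be illustrated with a minimal sketch. It assumes the documents are already tokenized and uses only the standard library; the function name `word_frequencies` and the toy data are hypothetical and not taken from this repository. A frequency dictionary of this shape is what WordCloud's `generate_from_frequencies` method accepts.

```python
from collections import Counter


def word_frequencies(documents, stopwords=frozenset(), top_n=None):
    """Count token occurrences across pre-tokenized documents.

    `documents` is an iterable of token lists. Tokens found in
    `stopwords` are skipped. Returns a dict of the `top_n` most
    frequent tokens (all tokens if `top_n` is None).
    """
    counts = Counter()
    for tokens in documents:
        counts.update(t for t in tokens if t not in stopwords)
    return dict(counts.most_common(top_n))


# Toy data (hypothetical); a real run would read from the corpus directory.
docs = [
    ["stock", "market", "rises"],
    ["market", "falls", "the"],
    ["the", "market"],
]
freqs = word_frequencies(docs, stopwords={"the"}, top_n=2)
print(freqs)
```
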
Text classification with BERT, LSTM, and MultiHeadAttention (status: training finished)
- Tech stack: deep learning framework (PyTorch), pretrained model (BERT), text processing tools (the transformers library)
Text recommendation based on word-vector matching
- Tech stack: word vector model (GloVe), recommendation algorithm (cosine similarity)
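
A minimal sketch of cosine-similarity recommendation over word vectors follows. It assumes each document is represented by the mean of its word vectors, which is one common choice and not necessarily what this project does. The tiny 3-dimensional `toy_embeddings` table stands in for real GloVe vectors, and all function names are hypothetical.

```python
import math


def mean_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(query_tokens, corpus, embeddings, k=2):
    """Return the ids of the k documents most similar to the query."""
    query_vec = mean_vector(query_tokens, embeddings)
    if query_vec is None:
        return []
    scored = []
    for doc_id, tokens in corpus.items():
        doc_vec = mean_vector(tokens, embeddings)
        if doc_vec is not None:
            scored.append((cosine(query_vec, doc_vec), doc_id))
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy embeddings; real GloVe vectors have 50 to 300 dimensions.
toy_embeddings = {
    "stock": [1.0, 0.1, 0.0],
    "market": [0.9, 0.2, 0.0],
    "football": [0.0, 1.0, 0.1],
    "match": [0.1, 0.9, 0.0],
}
corpus = {"finance": ["stock", "market"], "sports": ["football", "match"]}
top = recommend(["market"], corpus, toy_embeddings, k=1)
print(top)
```
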
Named entity recognition and its visualization, based on HanLP
- Tech stack: HanLP toolkit, named entity recognition algorithms
Text clustering and its visualization
- Tech stack: text clustering algorithm (K-means), data visualization libraries (Matplotlib, Seaborn, Plotly), text processing tools (NLTK)
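
To show what the K-means step does, here is a minimal pure-Python sketch of Lloyd's algorithm. It assumes each text has already been turned into a numeric feature vector (for example TF-IDF or averaged word vectors; the README does not say which). The `kmeans` function and the 2-D toy points are hypothetical, and a real project would more likely rely on a library implementation.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster `points` into k groups with Lloyd's algorithm.

    Returns (labels, centroids), where labels[i] is the cluster
    index assigned to points[i].
    """
    rng = random.Random(seed)
    # Initialize centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical 2-D feature vectors forming two well-separated groups.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

The resulting labels can then be used to color a scatter plot of the (dimension-reduced) document vectors with Matplotlib, Seaborn, or Plotly.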

bert_naive_20240501174421.pth nlp_course_project\pretrained_weight\bert-base-chinese\pytorch_model.bin
