Merged
4 changes: 3 additions & 1 deletion docs/source/tutorial/zh/pretrain/start.rst
@@ -1,5 +1,7 @@
Train models
---------
------------

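To train a model, call the train_vector function directly; it wraps the training workflow in a single interface. The module relies on the corresponding training models in the gensim library and currently provides the "sg", "cbow", "fastext", "d2v", "bow", and "tfidf" training methods. It also exposes an embedding_dim parameter so that the vector dimension can be set as needed.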
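
As a rough illustration of how the method names could map onto gensim, here is a minimal sketch. It is not the actual train_vector implementation: `word2vec_kwargs` and `train_word_vectors` are hypothetical helpers, only "sg" and "cbow" are covered, and gensim 4 or later is assumed (the dimension argument is `vector_size` there, `size` in gensim 3.x).

```python
def word2vec_kwargs(method, embedding_dim):
    """Hypothetical helper: translate a method name into gensim Word2Vec arguments."""
    if method not in ("sg", "cbow"):
        raise ValueError("this sketch only covers 'sg' and 'cbow', got %r" % method)
    # gensim's Word2Vec selects skip-gram with sg=1 and CBOW with sg=0.
    return {"sg": 1 if method == "sg" else 0, "vector_size": embedding_dim}


def train_word_vectors(token_lists, method="sg", embedding_dim=100):
    """Hypothetical stand-in for a training entry point; requires gensim >= 4."""
    # Imported lazily so the argument mapping above can be used without gensim.
    from gensim.models import Word2Vec

    # min_count=1 keeps every token, which suits a tiny demonstration corpus.
    return Word2Vec(
        sentences=token_lists,
        min_count=1,
        **word2vec_kwargs(method, embedding_dim)
    )
```

Calling `train_word_vectors([["solve", "x"], ["find", "x"]], method="cbow", embedding_dim=50)` would then return a gensim model whose vectors have 50 dimensions.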

Basic steps
##################
2 changes: 1 addition & 1 deletion docs/source/tutorial/zh/seg.rst
@@ -10,7 +10,7 @@
Main processing
--------------------

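1. Convert multiple-choice questions supplied as a dict into a conforming item through semantic component segmentation
1. Convert multiple-choice questions supplied as a dict into a conforming item through `syntax parsing <parse.rst>`_

2. Split and group the input item by element type.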

6 changes: 3 additions & 3 deletions docs/source/tutorial/zh/vectorization.rst
@@ -6,11 +6,11 @@
Overall workflow
---------------------------

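1. Parse the incoming item to obtain the SIF format;
1. Apply `syntax parsing <parse.rst>`_ to the incoming item to obtain the SIF format;

2. Segment the sif_item into components
2. Apply `component segmentation <seg.rst>`_ to the sif_item

3. Tokenize the segmented item
3. Apply `tokenization <tokenize.rst>`_ to the segmented item

4. Use an existing model, or one of the provided pretrained models, to convert the tokenized item into a vector.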
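
The four stages above can be sketched end to end as a toy pipeline. Every function here is a hypothetical stand-in written for illustration, not the library's API: real SIF parsing, segmentation, and tokenization do far more, and the final stage swaps a simple bag-of-words count in for a pretrained model.

```python
import re
from collections import Counter


def parse(item):
    """Stage 1 stand-in: normalise whitespace (real SIF parsing does much more)."""
    return " ".join(item.split())


def segment(sif_item):
    """Stage 2 stand-in: split into ("text", ...) and ("formula", ...) segments."""
    segments = []
    # re.split with one capture group alternates outside-text and $...$ content,
    # so odd indices hold the formula bodies.
    for index, part in enumerate(re.split(r"\$(.+?)\$", sif_item)):
        part = part.strip()
        if part:
            segments.append(("formula" if index % 2 else "text", part))
    return segments


def tokenize(segments):
    """Stage 3 stand-in: lowercase words for text, one compact token per formula."""
    tokens = []
    for kind, content in segments:
        if kind == "formula":
            tokens.append(content.replace(" ", ""))
        else:
            tokens.extend(content.lower().split())
    return tokens


def vectorize(tokens, vocabulary):
    """Stage 4 stand-in: bag-of-words counts instead of a pretrained model."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


tokens = tokenize(segment(parse("Solve  $x + 1 = 2$ for x")))
vector = vectorize(tokens, ["solve", "for", "x", "x+1=2", "prove"])
```

Keeping each stage as a separate function mirrors the linked documents: the output of one stage is the only input to the next, so any stage can be replaced without touching the others.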
