20 changes: 11 additions & 9 deletions docs/source/tutorial/zh/index.rst
@@ -1,15 +1,17 @@
Getting Started
===============

.. toctree::
:maxdepth: 1
:titlesonly:

sif
seg
parse
tokenize
vectorization
* `Standard Item Format <sif.rst>`_

* `Syntax Parsing <seg.rst>`_

* `Component Segmentation <parse.rst>`_

* `Tokenization <tokenize.rst>`_

* `Vectorization <vectorization.rst>`_

* `Pretraining <pretrain.rst>`_

Examples
--------
14 changes: 1 addition & 13 deletions docs/source/tutorial/zh/pretrain/pub.rst
@@ -19,26 +19,14 @@
* Without a third-party initialization vocabulary
* With a third-party initialization vocabulary



Model naming rule: first-level version + second-level version + gensim_luna_stem + tokenization rule + model method + dimension

Examples:

::

Path of the full-version, all-subject D2V model:
`/share/qlh/d2v_model/luna_pub/luna_pub_all_gensim_luna_stem_general_d2v_256.bin`
(Note: one D2V model consists of 4 files with the .bin suffix)
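The naming rule above can be sketched as a small helper. This is only an illustration: the function name `build_model_filename` and its parameter names are hypothetical, and the split of `luna_pub_all` into a first-level version (`luna_pub`) and a second-level version (`all`) is an assumption inferred from the example path.

```python
def build_model_filename(primary, secondary, tokenizer_rule, method, dim, suffix=".bin"):
    """Join the naming-rule components with underscores (hypothetical helper)."""
    parts = [
        primary,             # first-level version (assumed: "luna_pub")
        secondary,           # second-level version (assumed: "all")
        "gensim_luna_stem",  # fixed segment taken from the naming rule
        tokenizer_rule,      # tokenization rule, e.g. "general"
        method,              # model method, e.g. "d2v"
        str(dim),            # vector dimension, e.g. 256
    ]
    return "_".join(parts) + suffix


# Rebuild the file name that appears in the example path.
print(build_model_filename("luna_pub", "all", "general", "d2v", 256))
```

Under these assumptions the call reproduces the file name in the example path.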

Model Training Data
###################

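* The data currently used to train both the word-vector (w2v) and sentence-vector (d2v) models consists entirely of senior high school items
* Test data: `[OpenLUNA.json] <http://base.ustc.edu.cn/data/OpenLUNA/OpenLUNA.json>`_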

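The following models are currently available; more subject-specific and question-type-specific models are being trained, so stay tuned
"d2v_all_256" (all subjects), "d2v_sci_256" (science), "d2v_eng_256" (liberal arts), "d2v_lit_256" (English)
"d2v_all_256" (all subjects), "d2v_sci_256" (science), "d2v_eng_256" (English), "d2v_lit_256" (liberal arts)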
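As a rough sketch, the corrected name-to-subject mapping can be read off the model name itself. The helper `describe_model` and the `SUBJECTS` table are hypothetical. They assume the short names follow a `method_subject_dim` pattern and use the mapping from the updated line (`eng` for English, `lit` for liberal arts).

```python
# Subject codes as listed in the updated line (assumed to be the corrected mapping).
SUBJECTS = {
    "all": "all subjects",
    "sci": "science",
    "eng": "English",
    "lit": "liberal arts",
}


def describe_model(name):
    """Split a short model name such as "d2v_eng_256" into its parts (hypothetical helper)."""
    method, subject, dim = name.split("_")
    return {"method": method, "subject": SUBJECTS[subject], "dim": int(dim)}


for model_name in ("d2v_all_256", "d2v_sci_256", "d2v_eng_256", "d2v_lit_256"):
    print(model_name, describe_model(model_name))
```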

Model Training Examples
-----------------------