Dev0.4.5 #193

Merged
merged 277 commits into from
Jul 12, 2019
277 commits
74925f8
wrong folder
violetyao Jun 9, 2019
6a9ebd5
Merge branch 'dev0.5.0' of https://github.com/violetyao/fastNLP into …
violetyao Jun 9, 2019
83729df
moved test to reproduction folder
violetyao Jun 9, 2019
bc31b28
Merge pull request #163 from violetyao/dev0.5.0
xuyige Jun 9, 2019
a504c41
Merge pull request #2 from fastnlp/dev0.5.0
lyhuang18 Jun 10, 2019
37c50d6
Merge branch 'dev0.5.0' of github.com:fastnlp/fastNLP into dev0.5.0
yhcc Jun 11, 2019
6309eaf
1. Support handy functions such as split and int in fieldarray
yhcc Jun 12, 2019
ed3098e
1. Change how bert and elmo are cached, so that indexing via sentence_index is no longer needed
yhcc Jun 12, 2019
7b8b588
Fix the _ignore_type bug.
yhcc Jun 12, 2019
0f7c732
update embedding.py
xuyige Jun 12, 2019
a366f15
update documents in embedding.py
xuyige Jun 12, 2019
e4b0782
Add the requests dependency
yhcc Jun 13, 2019
4124b38
Update quickstart.rst
yhcc Jun 13, 2019
ba9c591
Merge remote-tracking branch 'origin/dev0.5.0' into batch
Jun 13, 2019
e5e4381
add fasttext embedding
xuyige Jun 13, 2019
1a4c3c2
fix some bugs in test
xuyige Jun 13, 2019
acf18e2
Fix a comment error in DataSet split
yhcc Jun 13, 2019
fabc692
Merge branch 'dev0.5.0' of github.com:fastnlp/fastNLP into dev0.5.0
yhcc Jun 13, 2019
7564818
[unstable] change Batch to torch's DataLoader
Jun 15, 2019
efe3574
Merge remote-tracking branch 'origin/dev0.5.0' into batch
Jun 15, 2019
17b5fd0
1. Remove the assert in Trainer that required train_data to be a DataSet
yhcc Jun 15, 2019
4b5113c
Change prefetch to a deprecated warning;
yhcc Jun 15, 2019
97b5909
Merge pull request #3 from fastnlp/dev0.5.0
lyhuang18 Jun 16, 2019
c78811f
add TC/MTL16Loader
lyhuang18 Jun 16, 2019
0e6a2a7
Merge pull request #168 from lyhuang18/lyhuang-reproduction
lyhuang18 Jun 16, 2019
30b012a
Fix a bug where duplicate same-named inputs overwrote each other during metric and loss mapping
yhcc Jun 17, 2019
95e8e2a
Merge branch 'dev0.5.0' of github.com:fastnlp/fastNLP into dev0.5.0
yhcc Jun 17, 2019
66f5139
Merge pull request #166 from fastnlp/batch
yhcc Jun 17, 2019
839d712
Extend value_count in field to support nested fields
yhcc Jun 17, 2019
2f5d896
1. Adapt to the change of Batch to pytorch's DataLoader
yhcc Jun 17, 2019
3938856
update matching.py
xuyige Jun 17, 2019
342b702
Merge remote-tracking branch 'origin/dev0.5.0' into dev0.5.0
xuyige Jun 17, 2019
93620e7
update framework of matching
xuyige Jun 17, 2019
9a8fe42
Add data loading and model code for NER; fix a typo in metric; change the default LSTM initialization to set the forget gate to 1.
yhcc Jun 18, 2019
4d138ed
Merge branch 'dev0.5.0' of github.com:fastnlp/fastNLP into dev0.5.0
yhcc Jun 18, 2019
4533427
Update sequence labeling
yhcc Jun 19, 2019
15c7c07
fix embed_loader
Jun 19, 2019
a137038
Fix ELMO and LSTM not working with nn.DataParallel
yhcc Jun 19, 2019
c4e131a
Rework the ELMO and LSTM DataParallel fix
yhcc Jun 19, 2019
1167d3b
Fix the elmo dataparallel issue again
yhcc Jun 19, 2019
76e2330
Add multi-GPU support to seq_len_to_mask
yhcc Jun 19, 2019
8a766f0
seq_len_to_mask now uses max_len directly instead of comparing it with the longest sentence length
yhcc Jun 19, 2019
6b9bc00
Fix a potential DataParallel issue in LSTM and remove the init_method parameter
yhcc Jun 19, 2019
0f4cf30
Fix an error in LSTM
yhcc Jun 19, 2019
8f7ed07
1. Add a no_create_entry_dataset option to from_dataset in vocabulary, for passing in dev and test
yhcc Jun 21, 2019
e57b8e4
Fix failing tests for seq_len_to_mask
yhcc Jun 21, 2019
d34c739
Update embedding.py
xuyige Jun 22, 2019
0a2122a
Merge branch 'dev0.5.0' of https://github.com/fastnlp/fastNLP into de…
xuyige Jun 22, 2019
4b0c26d
1. Fix a bug in BertEmbedding; 2. Fix type-conversion bugs in Batch and Field
yhcc Jun 23, 2019
5d08775
conflict merge
yhcc Jun 23, 2019
d1f531c
update matching dataloader in reproduction/matching
xuyige Jun 23, 2019
3593f0a
fix bugs in matching dataloader
xuyige Jun 23, 2019
4d9eb7c
update framework of matching
xuyige Jun 23, 2019
e913734
Fix parameter matching failing for DataParallel models
yhcc Jun 23, 2019
add8edf
Merge branch 'dev0.5.0' of github.com:fastnlp/fastNLP into dev0.5.0
yhcc Jun 23, 2019
39dd086
1. Fix a counterintuitive bug in CrossEntropyLoss; 2. Update sequence labeling
yhcc Jun 24, 2019
3710a30
Merge pull request #2 from fastnlp/dev0.5.0
dqwang122 Jun 24, 2019
43d3380
1. Fix a multi-device bug in Trainer initialization; 2. Add seq_len to CrossEntropyLoss
yhcc Jun 24, 2019
79762c4
Add Summarization framework
dqwang122 Jun 24, 2019
e0b23b1
update data loader of matching
xuyige Jun 24, 2019
238d4fb
Merge remote-tracking branch 'origin/dev0.5.0' into dev0.5.0
xuyige Jun 24, 2019
bc5e071
Delete matching.py
xuyige Jun 24, 2019
50faa93
add RTE and QNLI loader
xuyige Jun 25, 2019
40c4d21
Change the staticEmbedding initialization; results show that with this initialization ESIM reaches 88 test acc on snli more easily
yhcc Jun 26, 2019
8b8d184
Merge branch 'dev0.5.0' of https://github.com/fastnlp/fastNLP into de…
yhcc Jun 26, 2019
9c1b491
1. Fix a potential multi-step update bug in trainer; 2. Change LSTM data parallelism; 3. Fix a bug in embed_loader and allow manual initialization;
yhcc Jun 30, 2019
2e4d84d
Merge pull request #5 from fastnlp/dev0.5.0
lyhuang18 Jun 30, 2019
15d9581
fix a bug in predictor
xuyige Jun 30, 2019
fcc5a9f
Merge pull request #7 from fastnlp/dev0.5.0
lyhuang18 Jun 30, 2019
5f19601
Support data-parallel predict
yhcc Jun 30, 2019
f68b2c5
Tester supports data-parallel predict
yhcc Jun 30, 2019
3c98487
Tester data parallelism
yhcc Jun 30, 2019
406e1f7
fix and improve star_trans on SST
Jun 30, 2019
205f8fe
set random seed
Jun 30, 2019
204bb06
fix random process
Jul 1, 2019
610791a
update Readme.md
dqwang122 Jul 1, 2019
cdcc235
Merge pull request #3 from fastnlp/dev0.5.0
dqwang122 Jul 1, 2019
b0fe264
DataParallel support for the predict function in Tester
yhcc Jul 1, 2019
999f8ac
Add the joint_cws_parse code
yhcc Jul 1, 2019
2858ff8
Update the data folder reading code
yhcc Jul 1, 2019
ba34a56
Create README.md
xuyige Jul 1, 2019
9b31e6e
Update the ner readme
yhcc Jul 2, 2019
8e7a604
update documents in predictor
xuyige Jul 2, 2019
1bc780a
update framework in matching tasks
xuyige Jul 2, 2019
84b1889
1. Add the AdamW optimizer; 2. Fix the metric_key bug in Trainer; 3. Change static embedding initialization; 4. C…
yhcc Jul 3, 2019
4f91fb1
Merge branch 'dev0.5.0' of github.com:fastnlp/fastNLP into dev0.5.0
yhcc Jul 3, 2019
aa5f67e
first commit on tutorials branch
WillQvQ Jul 4, 2019
1ccc730
basic framework of docs folder
WillQvQ Jul 4, 2019
6e6a311
make tutorials folder
WillQvQ Jul 4, 2019
25ff06b
Update MatchingDataLoader.py
xuyige Jul 4, 2019
fff9490
Fix GradientClip updating incorrectly when update_every is used
yhcc Jul 4, 2019
29ca17d
Merge branch 'dev0.5.0' of https://github.com/fastnlp/fastNLP into de…
yhcc Jul 4, 2019
8f729b6
merge matching loader to fastNLP package
xuyige Jul 4, 2019
00cf982
fix a bug in matching loader
xuyige Jul 4, 2019
63e00bd
add more detail in README.md
WillQvQ Jul 5, 2019
1fef29e
Modify docs/source/user/tutorials.rst and docs/source/tutorials/tutorial_1_bat…
zide05 Jul 5, 2019
3e37356
Merge pull request #173 from zide05/tutorials
xuyige Jul 5, 2019
089009f
Major update:
xuyige Jul 5, 2019
66a7cf0
fix bug in test
xuyige Jul 5, 2019
a40f57a
Fix the pad index getting corrupted when new words are added to a Vocabulary after it has been built
yhcc Jul 6, 2019
0af7193
Merge branch 'dev0.5.0' of https://github.com/fastnlp/fastNLP into de…
yhcc Jul 6, 2019
ba2732a
Merge pull request #171 from QipengGuo/dev0.5.0
xpqiu Jul 6, 2019
32917da
add dpcnn
Jun 27, 2019
f1adb0f
add ID-CNN
Jul 4, 2019
372496c
update model & dataloader in text_classification
Jul 4, 2019
c5fc29d
-update DPCNN & train script
Jul 6, 2019
b02a91e
[add] dataloader: yelp/sst2/IMDB/MTL16
SrWYG Jul 6, 2019
dcb8746
Merge branch 'dev0.5.0' into master
SrWYG Jul 6, 2019
86ba01d
Merge pull request #1 from choosewhatulike/master
SrWYG Jul 6, 2019
c7463cf
[verify] yelpdataloader
SrWYG Jul 6, 2019
c07cbd7
Coreference resolution source code
Xiaoxiong-Liu Jul 6, 2019
fb0ce6c
Merge pull request #174 from Xiaoxiong-Liu/dev0.5.0
xuyige Jul 6, 2019
c35c060
Merge pull request #9 from fastnlp/dev0.5.0
lyhuang18 Jul 7, 2019
d05aca6
TC/LSTM
lyhuang18 Jul 7, 2019
368733d
Merge branch 'dev0.5.0' into lyhuang-reproduction
SrWYG Jul 7, 2019
1faeafe
Merge pull request #2 from lyhuang18/lyhuang-reproduction
SrWYG Jul 7, 2019
f6bba93
[verify] yelpdataloader
SrWYG Jul 7, 2019
2a1d5dc
Add cntn model for matching.
ohlionel Jul 7, 2019
03f3e78
Merge pull request #175 from ohlionel/dev0.5.0
xuyige Jul 7, 2019
f369778
[verify] sstdataloader add sst2
SrWYG Jul 7, 2019
d4fa698
add dpcnn
Jun 27, 2019
4e3fba5
add ID-CNN
Jul 4, 2019
4b5713b
update model & dataloader in text_classification
Jul 4, 2019
4272778
-update DPCNN & train script
Jul 6, 2019
451af53
- update dpcnn
Jul 7, 2019
7affe1f
Merge branch 'master' of https://github.com/choosewhatulike/fastNLP i…
Jul 7, 2019
1d4e996
- update star-transformer README
Jul 7, 2019
e6dd7ba
Merge branch 'dev0.5.0' of https://github.com/SrWYG/fastNLP into pr
Jul 7, 2019
5dc43c6
create load dataset tutorial
xuyige Jul 7, 2019
48d6f38
Merge pull request #10 from SrWYG/dev0.5.0
lyhuang18 Jul 7, 2019
5d9e064
text_classfication
lyhuang18 Jul 7, 2019
46c82a7
text_classfication
lyhuang18 Jul 7, 2019
8156f3c
Results
lyhuang18 Jul 7, 2019
8f78bf5
Fix readme formatting
lyhuang18 Jul 7, 2019
2c9a6e0
1. Change how ELMO loads allennlp weights;
yhcc Jul 8, 2019
a4bd424
Merge branch 'dev0.5.0' of github.com:fastnlp/fastNLP into dev0.5.0
yhcc Jul 8, 2019
e867641
Change weight loading in _elmo.py
yhcc Jul 8, 2019
cafe396
Merge pull request #3 from choosewhatulike/master
SrWYG Jul 8, 2019
eb00da2
Merge pull request #4 from lyhuang18/lyhuang-reproduction
SrWYG Jul 8, 2019
4687b37
[verify] readme
SrWYG Jul 8, 2019
ca94a00
Merge pull request #177 from SrWYG/dev0.5.0
xuyige Jul 8, 2019
d8bd40d
[verify] sst2loader use spacy tokenizer
SrWYG Jul 8, 2019
1cc115e
Merge branch 'dev0.5.0' of https://github.com/fastnlp/fastNLP into de…
yhcc Jul 8, 2019
191af01
[verify] sst2loader/IMDB use spacy tokenizer
SrWYG Jul 8, 2019
248eefe
add BertSum
Jul 8, 2019
1f4cd0c
Merge pull request #178 from SrWYG/dev0.5.0
xuyige Jul 8, 2019
e05ccb8
Merge pull request #179 from maszhongming/dev0.5.0
xuyige Jul 8, 2019
b445ea6
Remove outdated reproduction content
xuyige Jul 8, 2019
b3d6acf
Switch the batch, loss, and optimizer tutorial to the SST dataset
zide05 Jul 8, 2019
f861dcc
Merge pull request #180 from zide05/tutorials
WillQvQ Jul 8, 2019
8f6de5b
Create the tutorials directory
WillQvQ Jul 8, 2019
602efc4
Merge pull request #12 from fastnlp/dev0.5.0
lyhuang18 Jul 8, 2019
9130929
[remove] outdated SSTloader
SrWYG Jul 8, 2019
bdda4d9
Delete sstLoader.py
SrWYG Jul 8, 2019
ba47fb8
[verify] sst2loader
SrWYG Jul 8, 2019
0476b28
Delete SSTLoader.py
lyhuang18 Jul 8, 2019
83381d2
Merge remote-tracking branch 'origin/dev0.5.0' into dev0.5.0
SrWYG Jul 8, 2019
8cd9ea1
[verify] sst2loader
SrWYG Jul 8, 2019
124edb5
Results for Yelp_f
lyhuang18 Jul 8, 2019
3cba5a3
fix a bug
WillQvQ Jul 8, 2019
e38777b
Merge pull request #5 from lyhuang18/lyhuang-reproduction
SrWYG Jul 8, 2019
389966d
Merge pull request #181 from SrWYG/dev0.5.0
xuyige Jul 8, 2019
43fac84
1. Add a learning rate WarmupCallback; 2. Add a model-saving callback; 3. Additions in utils for bio…
yhcc Jul 8, 2019
99d3230
Merge branch 'dev0.5.0' of https://github.com/fastnlp/fastNLP into de…
yhcc Jul 8, 2019
16ddcb7
Delete dataset_loader.py
dqwang122 Jul 8, 2019
bb373fd
delete my fastnlp/io/dataset_loader.py and add test file
dqwang122 Jul 8, 2019
8707f7f
Merge branch 'master' of https://github.com/brxx122/fastNLP
dqwang122 Jul 8, 2019
84e659b
add original dataset loader
dqwang122 Jul 8, 2019
9fe06df
Merge pull request #5 from fastnlp/dev0.5.0
dqwang122 Jul 8, 2019
c588ba7
Merge BertSum and reorganize Summarization Task
dqwang122 Jul 8, 2019
5bb6c22
Delete removed files
dqwang122 Jul 8, 2019
f33008a
Merge pull request #182 from brxx122/master
xuyige Jul 8, 2019
d70aa96
Major update: 1. Update requirements; 2. Move the contents of modules.aggregator into modules.encoder; 3. S…
xuyige Jul 8, 2019
a39dafa
fix bug in tests
xuyige Jul 8, 2019
1761f59
fix bug in test code
xuyige Jul 8, 2019
a83cee0
fix bug in load dataset test code
xuyige Jul 8, 2019
be3f5ee
fix bug in load dataset test code
xuyige Jul 8, 2019
d3f81dd
final fix bug in load dataset test code
xuyige Jul 8, 2019
1babf53
Fix the no_create_entry bug in Vocabulary
yhcc Jul 9, 2019
9e863bb
Split out batch
zide05 Jul 9, 2019
da88a0d
Split out batch (revision)
zide05 Jul 9, 2019
e876082
Fix a bug in Embedding
yhcc Jul 9, 2019
9ec1570
Dataloader for sequence labeling
yhcc Jul 9, 2019
488ce6b
Update test_dataLoader.py
xuyige Jul 9, 2019
14778ee
[bug fix] sst loader, star-transformer
Jul 9, 2019
834443e
Merge branch 'tutorials' into tutorials
zide05 Jul 9, 2019
79fd7f9
Merge pull request #183 from zide05/tutorials
xuyige Jul 9, 2019
2c00c1a
Remove elmo's dependency on h5py
yhcc Jul 9, 2019
000deb2
Merge pull request #184 from choosewhatulike/master
xuyige Jul 9, 2019
9e81eae
Fix an elmo bug
yhcc Jul 9, 2019
bdc7b18
Merge branch 'dev0.5.0' of github.com:fastnlp/fastNLP into dev0.5.0
yhcc Jul 9, 2019
68f719e
[add] callback tutorial
Jul 10, 2019
97c7ba3
Add the mwan model and slightly modify the matching dataloader
FFTYYY Jul 10, 2019
aec0414
finish callback tutorial
Jul 10, 2019
eb01a5e
Merge pull request #185 from FFTYYY/dev0.5.0
xuyige Jul 10, 2019
5c80c6f
Rename DataInfo to DataBundle
yhcc Jul 10, 2019
dc8ae56
add embedding tutorial
xuyige Jul 10, 2019
83f9cb1
update README.md
xuyige Jul 10, 2019
28d9ae0
Update some outdated code
xuyige Jul 10, 2019
7d07b38
fix bug in matching DataLoader
xuyige Jul 10, 2019
2eba991
fix matching DataLoader test code
xuyige Jul 10, 2019
f1463d8
Merge pull request #186 from fastnlp/dev0.5.0
WillQvQ Jul 11, 2019
6b6a47c
dataset tutorials
WillQvQ Jul 11, 2019
16e2474
List the currently exposed modules and models; more need to be added later.
WillQvQ Jul 11, 2019
ee1e470
Major documentation update:
WillQvQ Jul 11, 2019
be7ffcf
Set the version number to 0.4.5
WillQvQ Jul 11, 2019
d54122f
Merge pull request #6 from fastnlp/dev0.5.0
SrWYG Jul 11, 2019
3a74636
Merge pull request #187 from fastnlp/tutorials
WillQvQ Jul 11, 2019
efea6ce
[verify] train_char_cnn optimization
SrWYG Jul 11, 2019
ca145d2
fix import bug
Jul 11, 2019
2610c20
Merge pull request #6 from fastnlp/dev0.5.0
dqwang122 Jul 11, 2019
6c7009d
Tutorial titles
WillQvQ Jul 11, 2019
1dd8db7
Merge pull request #188 from SrWYG/dev0.5.0
xuyige Jul 11, 2019
897ed8a
modify readme and update trainer.py
dqwang122 Jul 11, 2019
ec08564
add rouge in readme
dqwang122 Jul 11, 2019
2c1a830
Merge pull request #190 from brxx122/master
xuyige Jul 11, 2019
df0bc2a
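Update the docs to match the code
WillQvQ Jul 11, 2019
2e523c6
Update the fastNLP architecture diagram and workflow diagram
xuyige Jul 11, 2019
579bdb1
Update the DataSetLoader docs and the corresponding tutorial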
xuyige Jul 11, 2019
9f3968d
Merge remote-tracking branch 'origin/dev0.5.0' into dev0.5.0
xuyige Jul 11, 2019
2736570
Revise documentation content
xuyige Jul 11, 2019
6cf1a85
rename Batch to DataSetIter in enas_trainer
xuyige Jul 11, 2019
570b214
Add the fastNLP.embeddings module and adapt the existing code to fastNLP.embeddings
xuyige Jul 11, 2019
76198ac
update README.md: add description in fastNLP.embeddings
xuyige Jul 11, 2019
327833a
fix bug in test code
xuyige Jul 11, 2019
65f9285
add testing code in stack embedding
xuyige Jul 11, 2019
4df322e
Merge pull request #14 from fastnlp/dev0.5.0
lyhuang18 Jul 11, 2019
bedb792
Slightly widen the almost equal tolerance in test_embed_loader to avoid occasional test failures
xuyige Jul 11, 2019
defcaae
Slightly widen the almost equal tolerance in test_embed_loader to avoid occasional test failures
xuyige Jul 11, 2019
0d48770
Merge pull request #17 from fastnlp/dev0.5.0
lyhuang18 Jul 11, 2019
ffbba0f
Adapt the code to the new embeddings module
lyhuang18 Jul 11, 2019
0269250
Merge pull request #191 from lyhuang18/lyhuang-reproduction
lyhuang18 Jul 11, 2019
807a8f1
1. BucketSampler no longer needs batch_size passed in; Trainer sets it automatically
yhcc Jul 12, 2019
0a33a32
Merge branch 'dev0.5.0' of https://github.com/fastnlp/fastNLP into de…
yhcc Jul 12, 2019
391793a
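Latest docs structure
WillQvQ Jul 12, 2019
c99f02a
Introduction for the API docs entry page
WillQvQ Jul 12, 2019
d6ae241
Aliases for the decoder section
WillQvQ Jul 12, 2019
a09cf51
Introduction for the modules entry page and docs for dropout
WillQvQ Jul 12, 2019
f3a9fc5
Structure and docs inside encoder
WillQvQ Jul 12, 2019
dcc6d5d
Doc structure and aliases for models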
WillQvQ Jul 12, 2019
ce72936
:maxdepth: 1
WillQvQ Jul 12, 2019
fc0f86a
Doc structure and aliases for io and data_loader
WillQvQ Jul 12, 2019
9f681dc
fix tutorial typo
Jul 12, 2019
90c6454
Revise the docs home page
WillQvQ Jul 12, 2019
af4fd46
Reorder the submodules within the io module
WillQvQ Jul 12, 2019
39f3acc
Add docs for Embedding
yhcc Jul 12, 2019
f3e19dd
erge conflict in bert
yhcc Jul 12, 2019
98ebc3c
Add version notes to README.md
xuyige Jul 12, 2019
1fed365
Check the embeddings docs
WillQvQ Jul 12, 2019
25421d1
Merge branch 'master' into dev0.5.0
xuyige Jul 12, 2019
16 changes: 16 additions & 0 deletions .gitignore
@@ -0,0 +1,16 @@
.gitignore

.DS_Store
.ipynb_checkpoints
*.pyc
__pycache__
*.swp
.vscode/
.idea/**

caches

# fitlog
.fitlog
logs/
.fitconfig
2 changes: 1 addition & 1 deletion .travis.yml
@@ -8,7 +8,7 @@ install:
- pip install pytest-cov
# command to run tests
script:
- pytest --cov=./
- pytest --cov=./ test/

after_success:
- bash <(curl -s https://codecov.io/bash)
77 changes: 45 additions & 32 deletions README.md
@@ -6,48 +6,69 @@
![Hex.pm](https://img.shields.io/hexpm/l/plug.svg)
[![Documentation Status](https://readthedocs.org/projects/fastnlp/badge/?version=latest)](http://fastnlp.readthedocs.io/?badge=latest)

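fastNLP is a lightweight NLP toolkit. You can use it to quickly complete a named entity recognition (NER), Chinese word segmentation, or text classification task, or use it to build complex network models for research. It has the following features:
fastNLP is a lightweight NLP toolkit. You can use it to quickly complete tasks such as sequence labeling ([NER](reproduction/seqence_labelling/ner), POS-Tagging, etc.), Chinese word segmentation, [text classification](reproduction/text_classification), [Matching](reproduction/matching), [coreference resolution](reproduction/coreference_resolution), and [summarization](reproduction/Summarization), or use it to build complex network models for research. It has the following features: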

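- A unified tabular data container that keeps data preprocessing simple and clear. Built-in DataSet Loaders for many datasets remove the need for preprocessing code.
- Various convenient NLP tools, such as embedding loading during preprocessing and caching of intermediate data;
- Detailed Chinese documentation for reference;
- A unified tabular data container that keeps data preprocessing simple and clear. Built-in DataSet Loaders for many datasets remove the need for preprocessing code;
- A range of training and testing components, such as the Trainer, the Tester, and various evaluation metrics;
- Various convenient NLP tools, such as embedding loading during preprocessing (including ELMo and BERT) and caching of intermediate data;
- Detailed Chinese [documentation](https://fastnlp.readthedocs.io/) and [tutorials](https://fastnlp.readthedocs.io/zh/latest/user/tutorials.html) for reference;
- Many advanced modules, such as Variational LSTM, Transformer, and CRF;
- Wrapped models such as CNNText and Biaffine that can be used directly;
- Ready-to-use models for sequence labeling, Chinese word segmentation, text classification, Matching, coreference resolution, summarization, and other tasks; see the [reproduction](reproduction) section for details;
- A convenient and extensible trainer, with many built-in callback functions for experiment logging, exception catching, and more.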


## Installation guide

fastNLP depends on the following packages:
fastNLP depends on the following packages:

+ numpy
+ torch>=0.4.0
+ tqdm
+ nltk
+ numpy>=1.14.2
+ torch>=1.0.0
+ tqdm>=4.28.1
+ nltk>=3.4.1
+ requests
+ spacy

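Installing torch may depend on your operating system and CUDA version; see the PyTorch website.
Once the dependencies are installed, you can run the following command on the command line to complete the installation
Installing torch may depend on your operating system and CUDA version; see the [PyTorch website](https://pytorch.org/).
After the dependencies are installed, you can run the following commands on the command line to complete the installation: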

```shell
pip install fastNLP
python -m spacy download en
```
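
The minimum versions listed above can be checked before installing fastNLP. The following is a minimal sketch, not part of fastNLP: the helper names `version_tuple`, `meets_minimum`, and `check_requirements` are hypothetical, and the comparison only looks at the leading numeric parts of each version string (it assumes Python 3.8+ for `importlib.metadata`).

```python
from importlib import metadata

# Minimum versions copied from the dependency list above.
# requests and spacy are listed without a minimum, so they are not checked here.
MINIMUMS = {"numpy": "1.14.2", "torch": "1.0.0", "tqdm": "4.28.1", "nltk": "3.4.1"}


def version_tuple(text):
    """Turn '1.14.2', '1.0.0+cu101' or '4.28.1rc1' into a tuple of ints."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUMS):
    """Return {package: 'ok' | 'too old' | 'missing'} for the current environment."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report
```

Calling `check_requirements()` and printing the result gives a quick overview of which packages still need to be installed or upgraded; pre-release suffixes are ignored, so treat the result as a rough guide only.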

The version of fastNLP currently installed via pip is 0.4.1, which still lacks many features; refer to the master branch for the latest content.
fastNLP 0.5.0 will be released soon; stay tuned.

## Reference resources

- [Documentation](https://fastnlp.readthedocs.io/zh/latest/)
- [Source code](https://github.com/fastnlp/fastNLP)
## fastNLP tutorials

- [0. Quick start](https://fastnlp.readthedocs.io/zh/latest/user/quickstart.html)
- [1. Preprocessing text with DataSet](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_1_data_preprocess.html)
- [2. Loading datasets with DataSetLoader](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_2_load_dataset.html)
- [3. Converting text to vectors with the Embedding module](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_3_embedding.html)
- [4. Build a text classifier I: fast training and testing with Trainer and Tester](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_4_loss_optimizer.html)
- [5. Build a text classifier II: a custom training loop with DataSetIter](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_5_datasetiter.html)
- [6. Quickly implementing a sequence labeling model](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_6_seq_labeling.html)
- [7. Quickly building custom models with Modules and Models](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_7_modules_models.html)
- [8. Quickly evaluating your model with Metric](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_8_metrics.html)
- [9. Customizing your training process with Callback](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_9_callback.html)
- [10. Using fitlog to assist research with fastNLP](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_10_fitlog.html)



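## Built-in components

Most neural networks used for NLP tasks can be viewed as consisting of three kinds of modules: encoder, aggregator, and decoder.
Most neural networks used for NLP tasks can be viewed as consisting of word embeddings (embeddings) plus two kinds of modules: an encoder and a decoder.

Taking text classification as an example, the figure below shows the model flow of a text classifier built with BiLSTM+Attention:


![](./docs/source/figures/text_classification.png)

fastNLP's modules package has many built-in components for the three kinds of modules, helping users quickly build the networks they need. The functions and common components of the three kinds of modules are as follows:
fastNLP's embeddings module has several built-in embeddings: static embeddings (GloVe, word2vec), contextual embeddings
(ELMo, BERT), and character embeddings (CNN- or LSTM-based CharEmbedding)

In addition, fastNLP's modules package has many built-in components for the two kinds of modules, helping users quickly build the networks they need. The functions and common components of the two kinds of modules are as follows: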

<table>
<tr>
@@ -57,29 +78,17 @@ fastNLP's modules package has many built-in components for the three kinds of modules, which can help
</tr>
<tr>
<td> encoder </td>
<td> Encodes the input into vectors with representational power </td>
<td> Encodes the input into vectors with representational power </td>
<td> embedding, RNN, CNN, transformer
</tr>
<tr>
<td> aggregator </td>
<td> Aggregates information from multiple vectors </td>
<td> self-attention, max-pooling </td>
</tr>
<tr>
<td> decoder </td>
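<td> Decodes vectors that carry some representational meaning into the required output form </td>
<td> Decodes vectors that carry some representational meaning into the required output form </td>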
<td> MLP, CRF </td>
</tr>
</table>


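## Complete models
fastNLP implements many complete models for different NLP tasks, all of which have been trained and tested.

You can find related information in the following two places
- [Introduction](reproduction/)
- [Source code](fastNLP/models/)

## Project structure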

![](./docs/source/figures/workflow.png)
@@ -93,7 +102,7 @@ The general workflow of fastNLP is shown in the figure above, and the project structure is as follows:
</tr>
<tr>
<td><b> fastNLP.core </b></td>
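<td> Implements the core functionality, including data processing components, the trainer, the speed tester, etc. </td>
<td> Implements the core functionality, including data processing components, the trainer, the tester, etc. </td>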
</tr>
<tr>
<td><b> fastNLP.models </b></td>
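@@ -103,6 +112,10 @@ The general workflow of fastNLP is shown in the figure above, and the project structure is as follows:
<td><b> fastNLP.modules </b></td>
<td> Implements many components for building neural network models </td>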
</tr>
<tr>
<td><b> fastNLP.embeddings </b></td>
<td> Converts index sequences into vector sequences, including loading pretrained embeddings </td>
</tr>
<tr>
<td><b> fastNLP.io </b></td>
<td> Implements read/write functionality, including data loading and model reading and writing </td>
3 changes: 3 additions & 0 deletions docs/Makefile
@@ -19,6 +19,9 @@ apidoc:
server:
cd build/html && python -m http.server

dev:
rm -rf build/html && make html && make server

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
41 changes: 41 additions & 0 deletions docs/README.md
@@ -0,0 +1,41 @@
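# Quick start: writing fastNLP documentation

This tutorial is written for fastNLP documentation writers, who include collaborating developers and documentation maintainers. You will usually belong to the former group
and only need to understand part of the overall framework.

## Collaborating developers

fastNLP's documentation is generated with [Sphinx](http://sphinx.pocoo.org/), which is based on the [reStructuredText markup language](http://docutils.sourceforge.net/rst.html),
and is built and maintained automatically by [Read the Docs](https://readthedocs.org/).
Developers generally only need to write documentation that conforms to reStructuredText syntax and get it merged through a [PR](https://help.github.com/en/articles/about-pull-requests)
to contribute to fastNLP's documentation.

If you want to build the docs locally and write longer passages of documentation, you need to install Sphinx and the sphinx-rtd-theme theme: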
```bash
fastNLP/docs> pip install sphinx
fastNLP/docs> pip install sphinx-rtd-theme
```
Then run `make dev` in this directory. This command only supports Linux and macOS; you should see output like the following:
```bash
fastNLP/docs> make dev
rm -rf build/html && make html && make server
Running Sphinx v1.5.6
making output directory...
......
Build finished. The HTML pages are in build/html.
cd build/html && python -m http.server
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
```
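You can now open http://localhost:8000/ in your browser to view the docs. If you are working on a remote server, the address is http://{server-ip-address}:8000/ .
You must make sure that port 8000 on the server is open. If port 8000 on your computer or remote server is already in use, the program will move on to ports 8001, 8002, and so on.
When you are done, press Control (Ctrl) + C to stop the process.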
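
As far as I know, plain `python -m http.server` stops with an "Address already in use" error instead of trying the next port, so the fallback described above may depend on your setup. The following is a minimal sketch of that fallback, not part of the fastNLP Makefile: the function name `bind_first_free` and its parameters are hypothetical, and it assumes Python 3.7+.

```python
import functools
import http.server


def bind_first_free(start=8000, attempts=10, directory="build/html", host="0.0.0.0"):
    """Bind an HTTP server to the first free port in [start, start + attempts).

    Returns (server, port). The caller is responsible for running
    server.serve_forever() and, afterwards, server.server_close().
    """
    # Serve files from `directory` instead of the current working directory.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    for port in range(start, start + attempts):
        try:
            server = http.server.ThreadingHTTPServer((host, port), handler)
        except OSError:
            # Typically "Address already in use": try the next port.
            continue
        # Report the port that was actually bound.
        return server, server.server_address[1]
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")
```

To use it in place of `make server`, call `server, port = bind_first_free()` from the `docs` directory, print the port, and then call `server.serve_forever()`; Ctrl + C stops it as before.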

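We list the reStructuredText syntax most often used in fastNLP's docs [here](./source/user/example.rst) (use Raw mode when viewing it on the web);
reading it is a quick way to get started. Most of fastNLP's documentation is written in the code and extracted by Sphinx,
so you can also refer to this [unfinished article](./source/user/docs_in_code.rst) to learn the conventions for in-code documentation.

## Documentation maintainers

Documentation maintainers need to understand what every command in the Makefile does, and to know that the current doc structure
was obtained by manually modifying the output of sphinx-apidoc.
Maintainers should further improve the automation of the whole framework, and make sure collaborating developers do not break the overall structure of the documentation project.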
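
The per-module pages under `docs/source` touched by this PR all follow one pattern: a title, an `=` underline of the same length, and an `automodule` directive with three options. The sketch below reproduces that pattern; `render_stub` is a hypothetical helper (not part of fastNLP or Sphinx), and the 3-space option indent is an assumption, since the exact indentation is not visible in this diff view.

```python
def render_stub(module_name, indent=3):
    """Build the reStructuredText stub page for one module.

    `indent` is the number of spaces before each directive option;
    reStructuredText requires the options to be indented under the directive.
    """
    pad = " " * indent
    lines = [
        module_name,
        "=" * len(module_name),  # underline must be at least as long as the title
        "",
        f".. automodule:: {module_name}",
        f"{pad}:members:",
        f"{pad}:undoc-members:",
        f"{pad}:show-inheritance:",
        "",
    ]
    return "\n".join(lines)
```

For example, `print(render_stub("fastNLP.core.batch"))` produces a page with the same title and directive as `docs/source/fastNLP.core.batch.rst`.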
36 changes: 0 additions & 36 deletions docs/make.bat

This file was deleted.

2 changes: 0 additions & 2 deletions docs/quick_tutorial.md

This file was deleted.

4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -24,9 +24,9 @@
author = 'xpqiu'

# The short X.Y version
version = '0.4'
version = '0.4.5'
# The full version, including alpha/beta/rc tags
release = '0.4'
release = '0.4.5'

# -- General configuration ---------------------------------------------------

6 changes: 3 additions & 3 deletions docs/source/fastNLP.core.batch.rst
@@ -2,6 +2,6 @@ fastNLP.core.batch
==================

.. automodule:: fastNLP.core.batch
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/fastNLP.core.callback.rst
@@ -2,6 +2,6 @@ fastNLP.core.callback
=====================

.. automodule:: fastNLP.core.callback
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/fastNLP.core.const.rst
@@ -2,6 +2,6 @@ fastNLP.core.const
==================

.. automodule:: fastNLP.core.const
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/fastNLP.core.dataset.rst
@@ -2,6 +2,6 @@ fastNLP.core.dataset
====================

.. automodule:: fastNLP.core.dataset
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/fastNLP.core.field.rst
@@ -2,6 +2,6 @@ fastNLP.core.field
==================

.. automodule:: fastNLP.core.field
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/fastNLP.core.instance.rst
@@ -2,6 +2,6 @@ fastNLP.core.instance
=====================

.. automodule:: fastNLP.core.instance
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/fastNLP.core.losses.rst
@@ -2,6 +2,6 @@ fastNLP.core.losses
===================

.. automodule:: fastNLP.core.losses
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/fastNLP.core.metrics.rst
@@ -2,6 +2,6 @@ fastNLP.core.metrics
====================

.. automodule:: fastNLP.core.metrics
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/fastNLP.core.optimizer.rst
@@ -2,6 +2,6 @@ fastNLP.core.optimizer
======================

.. automodule:: fastNLP.core.optimizer
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
9 changes: 4 additions & 5 deletions docs/source/fastNLP.core.rst
@@ -2,15 +2,15 @@ fastNLP.core
============

.. automodule:: fastNLP.core
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:

Submodules
----------

.. toctree::
:titlesonly:
:maxdepth: 1

fastNLP.core.batch
fastNLP.core.callback
@@ -26,4 +26,3 @@ fastNLP.core
fastNLP.core.trainer
fastNLP.core.utils
fastNLP.core.vocabulary

6 changes: 3 additions & 3 deletions docs/source/fastNLP.core.sampler.rst
@@ -2,6 +2,6 @@ fastNLP.core.sampler
====================

.. automodule:: fastNLP.core.sampler
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance: