transformers implement (architecture, task example, serving and more)

xv44586/toolkit4nlp

toolkits for NLP

intent

To make it easier for me to study and understand certain topics, and to implement some of my own ideas.

Update info:

from toolkit4nlp.models import build_transformer_model

# Build your own embeddings_matrix; its rows must correspond to the vocabulary
config_path = ''
checkpoint_path = ''
embeddings_matrix = None
nezha = build_transformer_model(
  config_path=config_path,
  checkpoint_path=checkpoint_path,
  model='nezha',
  external_embedding_size=100,
  external_embedding_weights=embeddings_matrix,
)
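
The snippet above leaves `embeddings_matrix` as `None`. As a minimal sketch of what "rows corresponding to the vocabulary" could look like, the following builds a matrix in which row `i` holds the vector for the token whose id is `i`. The helper name `build_embeddings_matrix`, the toy `vocab`, and the `pretrained` vectors are hypothetical and not part of toolkit4nlp; the sketch also assumes token ids are contiguous and start at 0.

```python
import random


def build_embeddings_matrix(vocab, pretrained, dim, seed=0):
    """Return a len(vocab) x dim matrix; row i is the vector of the token with id i.

    vocab:      dict mapping token -> integer id (assumed contiguous from 0)
    pretrained: dict mapping token -> list of floats of length dim
    """
    rng = random.Random(seed)
    matrix = [None] * len(vocab)
    for token, idx in vocab.items():
        if token in pretrained:
            vec = list(pretrained[token])
            if len(vec) != dim:
                raise ValueError(f"vector for {token!r} has length {len(vec)}, expected {dim}")
        else:
            # Tokens without a pretrained vector get a small random vector.
            vec = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        matrix[idx] = vec
    return matrix


# Toy data, for illustration only.
vocab = {'[PAD]': 0, '[UNK]': 1, '我': 2, '爱': 3}
pretrained = {'我': [0.1, 0.2, 0.3], '爱': [0.4, 0.5, 0.6]}
embeddings_matrix = build_embeddings_matrix(vocab, pretrained, dim=3)
```

In the README snippet, `dim` would presumably match `external_embedding_size` (100 there). If the model expects an array rather than nested lists, the result can be converted first, for example with `numpy.asarray`.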
from toolkit4nlp.models import build_transformer_model
config_path = '/home/mingming.xu/pretrain/NLP/chinese_nezha_base/config.json'
checkpoint_path = '/home/mingming.xu/pretrain/NLP/chinese_nezha_base/model_base.ckpt'

model = build_transformer_model(config_path=config_path, checkpoint_path=checkpoint_path, model='nezha')
from toolkit4nlp.models import build_transformer_model
config_path = '/home/mingming.xu/pretrain/NLP/chinese_electra_base_L-12_H-768_A-12/config.json'
checkpoint_path = '/home/mingming.xu/pretrain/NLP/chinese_electra_base_L-12_H-768_A-12/electra_base.ckpt'


# lm
model = build_transformer_model(
  config_path=config_path,
  checkpoint_path=checkpoint_path,
  application='lm'
)

# unilm
model = build_transformer_model(
  config_path=config_path,
  checkpoint_path=checkpoint_path,
  application='unilm'
)
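
The README does not spell out what the `lm` and `unilm` applications change. In the usual formulation they differ only in the attention mask: `lm` uses a causal (lower-triangular) mask, while `unilm` lets every position see the whole source segment and lets the target segment see only its own earlier positions. The sketch below computes both masks in plain Python. It assumes the convention used by bert4keras (source tokens have segment id 0, target tokens have segment id 1); whether toolkit4nlp builds its masks exactly this way is an assumption, and both function names are hypothetical.

```python
def lm_attention_mask(length):
    """Causal mask: position i may attend to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(length)] for i in range(length)]


def unilm_attention_mask(segment_ids):
    """UniLM seq2seq mask; mask[i][j] == 1 means position i may attend to j.

    The running sum of segment ids is 0 over the source segment and then
    increases by one per target token, so comparing the sums gives
    bidirectional attention inside the source and causal attention in the target.
    """
    idxs, total = [], 0
    for s in segment_ids:
        total += s
        idxs.append(total)
    n = len(idxs)
    return [[1 if idxs[j] <= idxs[i] else 0 for j in range(n)] for i in range(n)]


# Three source tokens followed by two target tokens.
mask = unilm_attention_mask([0, 0, 0, 1, 1])
for row in mask:
    print(row)
```

Each source row comes out as `[1, 1, 1, 0, 0]`, so the source never sees the target. The two target rows are `[1, 1, 1, 1, 0]` and `[1, 1, 1, 1, 1]`.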
  • 2020.08.19 Added the ELECTRA model. Usage:
from toolkit4nlp.models import build_transformer_model


config_path = '/home/mingming.xu/pretrain/NLP/chinese_electra_base_L-12_H-768_A-12/config.json'
checkpoint_path = '/home/mingming.xu/pretrain/NLP/chinese_electra_base_L-12_H-768_A-12/electra_base.ckpt'

model =  build_transformer_model(
  config_path=config_path,
  checkpoint_path=checkpoint_path,
  model='electra',
)
from toolkit4nlp.tokenizers import Tokenizer
vocab = ''
tokenizer = Tokenizer(vocab, do_lower_case=True)
tokenizer.encode('我爱你中国')    
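
To illustrate what encoding with a BERT-style vocabulary involves, here is a simplified stand-in, not toolkit4nlp's `Tokenizer`. It assumes a vocabulary in which a token's id is its line number, splits the text character by character (reasonable for Chinese, but not a substitute for WordPiece on other scripts), and wraps the result in `[CLS]` and `[SEP]`. The names `load_vocab` and `simple_encode` are hypothetical, and the `(token_ids, segment_ids)` return shape follows the common BERT convention rather than anything this README confirms.

```python
def load_vocab(lines):
    """Map each token to its line index, as in a BERT-style vocab.txt."""
    return {token: idx for idx, token in enumerate(lines)}


def simple_encode(text, vocab, do_lower_case=True):
    """Character-level encoding: [CLS] + characters + [SEP]."""
    if do_lower_case:
        text = text.lower()
    tokens = ['[CLS]'] + list(text) + ['[SEP]']
    unk_id = vocab['[UNK]']
    # Characters missing from the vocabulary fall back to [UNK].
    token_ids = [vocab.get(token, unk_id) for token in tokens]
    # A single sentence uses segment id 0 throughout.
    segment_ids = [0] * len(token_ids)
    return token_ids, segment_ids


# Toy vocabulary; '国' is deliberately missing to show the [UNK] fallback.
vocab = load_vocab(['[PAD]', '[UNK]', '[CLS]', '[SEP]', '我', '爱', '你', '中'])
token_ids, segment_ids = simple_encode('我爱你中国', vocab)
print(token_ids)    # [2, 4, 5, 6, 7, 1, 3]
print(segment_ids)  # [0, 0, 0, 0, 0, 0, 0]
```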
  • 2020.07.16 Completed loading of pretrained weights for BERT. Usage:
from toolkit4nlp.models import build_transformer_model

config_path = ''
checkpoints_path = ''
model = build_transformer_model(config_path, checkpoints_path)

Mainly based on bert, bert4keras, and keras_bert.
