DistilBERT--SQuAD-v1-Notebook

This notebook contains a set of instructions on how to train DistilBERT from Hugging Face in Google Colab. Training is done on the SQuAD dataset. The trained model can be accessed via the Hugging Face Hub:

from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

# Load the fine-tuned DistilBERT question-answering model and its tokenizer from the Hub.
model = AutoModelForQuestionAnswering.from_pretrained('abhilash1910/distilbert-squadv1')
tokenizer = AutoTokenizer.from_pretrained('abhilash1910/distilbert-squadv1')

# Build a question-answering pipeline around them.
nlp_QA = pipeline('question-answering', model=model, tokenizer=tokenizer)

QA_inp = {
    'question': 'What is the fund price of Huggingface in NYSE?',
    'context': 'Huggingface Co. has a total fund price of $19.6 million dollars'
}
result = nlp_QA(QA_inp)
result

The result is:

{'score': 0.38547369837760925,
 'start': 42,
 'end': 55,
 'answer': '$19.6 million'}
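
For reference, fine-tuning DistilBERT on SQuAD v1 in Colab broadly follows the standard Hugging Face question-answering recipe. The sketch below invokes the run_qa.py example script shipped with the transformers repository; it illustrates the training setup rather than reproducing the notebook verbatim, and the hyperparameters shown are typical values, not necessarily the notebook's exact ones.

# With the transformers question-answering examples available locally
# (e.g. after cloning https://github.com/huggingface/transformers and
# changing into examples/pytorch/question-answering):
!python run_qa.py \
  --model_name_or_path distilbert-base-uncased \
  --dataset_name squad \
  --do_train \
  --do_eval \
  --per_device_train_batch_size 16 \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir ./distilbert-squadv1

Once training finishes, the checkpoint can be uploaded to the Hugging Face Hub (for example with model.push_to_hub(...) and the matching tokenizer.push_to_hub(...) call), which is what makes a model id such as abhilash1910/distilbert-squadv1 loadable by name as shown above.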

DistilBERT

DistilBERT is a lighter version of BERT: it is about 40% smaller than BERT while retaining 97% of its language-understanding performance. The original paper can be found here.

Tips:

  • DistilBERT doesn’t have token_type_ids, so you don’t need to indicate which token belongs to which segment. Just separate your segments with the separation token tokenizer.sep_token (or [SEP]); see the short example after this list.

  • DistilBERT doesn’t have an option to select the input positions (there is no position_ids input).
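
For instance, encoding a question/context pair with a DistilBERT tokenizer returns only input_ids and an attention_mask, with the two segments joined by [SEP]. This is a minimal sketch using the generic distilbert-base-uncased checkpoint rather than this repository's model:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')

# Two texts are encoded as one sequence separated by the [SEP] token;
# note that the output contains no 'token_type_ids' entry.
encoded = tokenizer('Where is the fund listed?', 'The fund is listed on the NYSE.')
print(list(encoded.keys()))                    # ['input_ids', 'attention_mask']
print(tokenizer.decode(encoded['input_ids']))  # [CLS] ... [SEP] ... [SEP]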

The architecture involves training a reduced-parameter "student" BERT from the pretrained "teacher" BERT: the student is initialized by transferring weights from the teacher and then trained to mimic it (knowledge distillation).
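
In the original DistilBERT setup, the student is trained with a combination of a distillation loss on the teacher's softened output distribution, the usual masked-language-modelling loss, and a cosine embedding loss between hidden states. The snippet below is only a minimal sketch of the soft-target component, with an arbitrarily chosen temperature; it is not the notebook's training code.

import torch
import torch.nn.functional as F

def soft_target_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence between the temperature-softened teacher and student
    # distributions, scaled by T**2 so gradient magnitudes stay comparable
    # across temperatures.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction='batchmean') * temperature ** 2

# Example with random logits standing in for teacher/student outputs.
teacher_logits = torch.randn(8, 30522)   # batch of 8, BERT vocabulary size
student_logits = torch.randn(8, 30522)
print(soft_target_loss(student_logits, teacher_logits))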
