BERTtextclassifier

BERTtextclassifier is a text classifier based on a pre-trained BERT model.

Requirement

python3
pytorch==1.6.0
torchtext==0.8.0
pandas
matplotlib
transformers==3.0.0
scikit-learn (imported as sklearn)
seaborn

Usage

./BERTtextclassifier.sh [-i infile]
                        [-m model_path]
                        <-d data_outdir>
                        <-o model_outdir>
                        <-l max_length>
                        <-b batch_size>
                        <-n num_epochs>
                        <-r learning_rate>

-i: filename of the raw input text. The input must be in TSV format with 4 columns, as follows:
ID<tab>title<tab>content<tab>label
Labels must be integers.
-m: directory of pre-trained BERT model.
-d: output directory for the preprocessed text data. Default=./data_preprocessed
-o: output directory for the trained model. Default=./model
-l: max length of sequences. Default=150
-b: batch size for the training, validation and test data. Default=4
-n: number of epochs to use. Default=5
-r: learning rate of model. Default=2e-5
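
The input format described for -i can be checked before starting a run. The following is a minimal sketch, not part of this repository: validate_tsv is a hypothetical helper, and because the README does not say whether the file starts with a header row, has_header is an assumed switch that defaults to no header.

```python
import csv


def validate_tsv(path, has_header=False):
    """Return a list of problems found in a 4-column TSV file (empty list = OK)."""
    problems = []
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps quote characters in titles or content as plain text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for line_no, row in enumerate(reader, start=1):
            if has_header and line_no == 1:
                continue  # assumed header row, skipped on request
            if len(row) != 4:
                problems.append(
                    f"line {line_no}: expected 4 columns, found {len(row)}"
                )
                continue
            label = row[3].strip()
            try:
                int(label)  # labels must be integers
            except ValueError:
                problems.append(
                    f"line {line_no}: label {label!r} is not an integer"
                )
    return problems
```

Calling validate_tsv("testdata/test.tsv") returns an empty list when every row has the four columns ID, title, content and label with an integer label; otherwise each entry names the offending line.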

Pre-trained Model

A pre-trained BERT model must be downloaded before training.
Example
To download the pre-trained BioBERT model, use the following commands:

git lfs install
git clone https://huggingface.co/dmis-lab/biobert-v1.1

The pre-trained BioBERT model will then be downloaded to ./biobert-v1.1.
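
If git lfs is not set up when cloning, large files such as the model weights arrive as small text pointers instead of the real data. The sketch below checks a downloaded directory for that case. It is an illustration only: check_model_dir is a hypothetical helper, and the file names in EXPECTED_FILES are an assumption about what a PyTorch BERT checkpoint typically contains, so adjust them for other models.

```python
import os

# Assumed file names for a PyTorch BERT checkpoint; not taken from this README.
EXPECTED_FILES = ("config.json", "vocab.txt", "pytorch_model.bin")

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def check_model_dir(model_dir):
    """Return a list of problems with a downloaded model directory."""
    problems = []
    for name in EXPECTED_FILES:
        path = os.path.join(model_dir, name)
        if not os.path.isfile(path):
            problems.append(f"missing file: {name}")
            continue
        # Read only the first bytes; real weight files can be very large.
        with open(path, "rb") as handle:
            head = handle.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            problems.append(f"{name} is a Git LFS pointer, not the real file")
    return problems
```

An empty result from check_model_dir("./biobert-v1.1") suggests the clone fetched real files; a pointer warning usually means git lfs install was skipped and the clone should be repeated.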

Example

With the BioBERT model downloaded, use the following command to test:

./BERTtextclassifier.sh -i testdata/test.tsv -m biobert-v1.1
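
To see which settings a run like this uses, the command can be assembled with the documented defaults spelled out. This is a sketch under assumptions: build_command is a hypothetical helper, the default values are copied from the option list in the Usage section, and it assumes that passing a default explicitly behaves the same as omitting the option.

```python
import shlex

# Defaults copied from the option list in the Usage section.
DEFAULTS = {
    "-d": "./data_preprocessed",
    "-o": "./model",
    "-l": "150",
    "-b": "4",
    "-n": "5",
    "-r": "2e-5",
}


def build_command(infile, model_path, **overrides):
    """Build the argument list for BERTtextclassifier.sh.

    Overrides are keyed by option letter without the dash, e.g. b=8 for -b.
    """
    options = dict(DEFAULTS)
    for letter, value in overrides.items():
        flag = "-" + letter
        if flag not in DEFAULTS:
            raise ValueError(f"unknown option: {flag}")
        options[flag] = str(value)
    command = ["./BERTtextclassifier.sh", "-i", infile, "-m", model_path]
    for flag, value in options.items():
        command += [flag, value]
    return command


command = build_command("testdata/test.tsv", "biobert-v1.1", b=8)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in command))
```

The returned list can be passed to subprocess.run(command, check=True) from the repository root to start the run.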
