optimized-bert

About The Project

This project is an attempt to democratize BERT by reducing its number of trainable parameters, in turn making it faster to pre-train and fine-tune. Since the cosine-style similarity that BERT computes via the dot product between the Query and Key matrices is prone to the convex-hull problem, we plan to replace it with other similarity metrics and check how that impacts the model at reduced dimensions.

We are benchmarking the model's performance with the following distance/similarity measures (a minimal sketch of swapping the metric follows the list):

  • Cosine
  • Euclidean
  • Gaussian softmax
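
To make the comparison concrete, here is a minimal sketch of a single attention head with a pluggable similarity score. It is an illustration under our own assumptions, not the repo's implementation (which lives in src/modelling); in particular, the exact forms of the Euclidean and Gaussian scores are placeholders.

    # Illustrative only: one attention head with a swappable similarity metric.
    import math
    import torch
    import torch.nn.functional as F

    def attention_scores(q, k, metric="dot"):
        # q, k: (batch, heads, seq_len, head_dim)
        d = q.size(-1)
        if metric == "dot":
            # vanilla BERT: scaled dot product between Query and Key
            return q @ k.transpose(-2, -1) / math.sqrt(d)
        if metric == "euclidean":
            # negative squared Euclidean distance: closer vectors score higher
            return -(q.unsqueeze(-2) - k.unsqueeze(-3)).pow(2).sum(-1) / math.sqrt(d)
        if metric == "gaussian":
            # RBF-style score; its softmax is one reading of "Gaussian softmax"
            return -(q.unsqueeze(-2) - k.unsqueeze(-3)).pow(2).sum(-1) / (2.0 * d)
        raise ValueError(f"unknown metric: {metric}")

    def attend(q, k, v, metric="dot"):
        # softmax over the chosen score, then weight the Value vectors
        weights = F.softmax(attention_scores(q, k, metric), dim=-1)
        return weights @ v

With q, k and v of shape (batch, heads, seq_len, head_dim), attend(q, k, v, metric="euclidean") is a drop-in replacement for the standard head.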

Due to compute constraints, we are currently validating our hypothesis on just 1% of the BookCorpus data. We intend to increase the training data in subsequent iterations.
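
For reference, a 1% slice of BookCorpus can be pulled with the Hugging Face datasets split syntax, roughly as below; whether dataset.py does exactly this is our assumption.

    # Illustrative only: load a 1% slice of BookCorpus with Hugging Face datasets.
    from datasets import load_dataset

    # "train[:1%]" requests only the first 1% of the train split.
    corpus = load_dataset("bookcorpus", split="train[:1%]")
    print(corpus)  # prints the number of examples in the slice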

Getting Started

  • Clone the repo
    git clone https://github.com/gaushh/optimized-bert.git
  • Create and activate a virtual environment for the project

Now you can set everything up using the single shell script, or by following the step-by-step instructions below.

Single Script Setup

Run the shell script to set everything up with default configs and start pre-training BERT.

  • Run the setup script
    sh setup.sh

Step By Step Setup

Here are the step-by-step instructions to set up the repo and, in turn, understand the process.

  1. Install required packages.
    pip install -r requirements.txt
  2. Log in to Weights & Biases
    wandb login --relogin 8c46e02a8d52f960fb349e009c5b6773c25b6957
  3. Write the config file
    cd helper
    python write_config.py
    cd ..
  4. Prepare the dataset
    cd src/data
    python dataset.py
    cd ../..
  5. Train the tokenizer (see the first sketch after this list)
    cd src/modelling
    python train_tokenizer.py
  6. Perform post-processing
    python preparation.py
  7. Start model training (see the second sketch after this list)
    python train_bert.py
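
As an illustration of step 5, the snippet below trains a WordPiece tokenizer with the tokenizers library; the vocabulary size, special tokens and file path are placeholder assumptions, not values taken from train_tokenizer.py.

    # Illustrative sketch of step 5: training a WordPiece tokenizer (hypothetical settings).
    from tokenizers import BertWordPieceTokenizer

    tokenizer = BertWordPieceTokenizer(lowercase=True)
    tokenizer.train(
        files=["corpus.txt"],  # placeholder path to the prepared text
        vocab_size=30_522,     # BERT's default vocabulary size
        special_tokens=["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"],
    )
    tokenizer.save_model(".")  # writes vocab.txt to the current directory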
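
As an illustration of step 7, one way to get a smaller BERT is to shrink the standard BertConfig and count the resulting trainable parameters; the dimensions below are placeholders, not the project's chosen configuration.

    # Illustrative sketch of step 7: a reduced-size BERT and its parameter count (hypothetical dims).
    from transformers import BertConfig, BertForMaskedLM

    config = BertConfig(
        hidden_size=256,         # BERT-base uses 768
        num_hidden_layers=4,     # BERT-base uses 12
        num_attention_heads=4,   # BERT-base uses 12
        intermediate_size=1024,  # BERT-base uses 3072
    )
    model = BertForMaskedLM(config)

    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"trainable parameters: {trainable:,}")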

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)