
optimized-bert

About The Project

This project is an attempt to democratize BERT by reducing its number of trainable parameters, in turn making it faster to pre-train and fine-tune. Since the cosine-style similarity that BERT computes as the dot product between the Query and Key matrices confines the softmax-weighted attention output to the convex hull of the Value vectors, we plan to replace cosine with other similarity metrics and check how that affects performance at reduced model dimensions.

We are benchmarking the model's performance with the following distance/similarity measures (an illustrative sketch of the alternative score functions follows the list):

  • Cosine
  • Euclidean
  • Gaussian softmax
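
The snippet below is a minimal sketch of how the attention score function could be swapped between these measures; it is not the repository's implementation. The function names and the sigma bandwidth are hypothetical, and "Gaussian softmax" is read here as a softmax over RBF-kernel (negative squared distance) scores.

    import torch
    import torch.nn.functional as F

    def dot_product_scores(q, k):
        # Standard BERT attention: scaled dot product between queries and keys.
        return (q @ k.transpose(-2, -1)) / (q.size(-1) ** 0.5)

    def euclidean_scores(q, k):
        # Negative squared Euclidean distance: closer query/key pairs score higher.
        return -torch.cdist(q, k, p=2) ** 2

    def gaussian_scores(q, k, sigma=1.0):
        # RBF-kernel log-scores; softmax over these gives a "Gaussian softmax".
        return -torch.cdist(q, k, p=2) ** 2 / (2 * sigma ** 2)

    def attention(q, k, v, score_fn=dot_product_scores):
        # q, k, v: (batch, seq_len, head_dim); only the score function changes.
        weights = F.softmax(score_fn(q, k), dim=-1)
        return weights @ v

Only the score function differs between the variants; the softmax normalization and the aggregation over Value vectors stay the same.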

Due to compute resource constraints, we are currently validating our hypothesis on just 1% of the BookCorpus data. We intend to increase the training data in subsequent iterations.
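
For reference, a 1% slice can be expressed directly with the Hugging Face datasets split-slicing syntax; this is only a sketch of the idea, and src/data/dataset.py may load the data differently.

    from datasets import load_dataset

    # Load only the first 1% of BookCorpus using the split-slicing syntax.
    bookcorpus_sample = load_dataset("bookcorpus", split="train[:1%]")
    print(bookcorpus_sample)  # shows the number of examples in the 1% slice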

Getting Started

  • Clone the repo
    git clone https://github.com/gaushh/optimized-bert.git
  • Create and activate a virtual environment in your IDE or terminal

Now you can set everything up using the single shell script or by following the step-by-step instructions.

Single Script Setup

Run the shell script to set everything up with default configs and start pre-training BERT.

  • shell script
    sh setup.sh

Step By Step Setup

Here are the step-by-step instructions to set up the repo and, in turn, understand the process.

  1. Install required packages.
    pip install -r requirements.txt
  2. Log in to Weights & Biases
    wandb login --relogin 8c46e02a8d52f960fb349e009c5b6773c25b6957
  3. Write the config file
    cd helper
    python write_config.py
    cd ..
  4. Prepare the dataset
    cd src/data
    python dataset.py
    cd ../..
  5. Train the tokenizer (a hedged sketch of this step follows the list)
    cd src/modelling
    python train_tokenizer.py
  6. Perform post-processing
    python preparation.py
  7. Start model training (see the pre-training sketch after these steps)
    python train_bert.py
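
As a rough illustration of step 5, a BERT-style WordPiece tokenizer could be trained with the Hugging Face tokenizers library roughly as below. The corpus path and output directory are hypothetical, and train_tokenizer.py may differ.

    from tokenizers import BertWordPieceTokenizer

    # Train a WordPiece vocabulary from scratch on the prepared text corpus.
    tokenizer = BertWordPieceTokenizer(lowercase=True)
    tokenizer.train(
        files=["corpus.txt"],   # hypothetical path to the prepared 1% BookCorpus text
        vocab_size=30522,       # standard BERT-base vocabulary size
        min_frequency=2,
    )
    tokenizer.save_model("tokenizer_out")  # writes vocab.txt to this directory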
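
Conceptually, step 7 amounts to masked-language-model pre-training. The sketch below assumes the Hugging Face transformers Trainer; model size, batch size, and paths are placeholder values, and the repository's train_bert.py (with its modified similarity function) will differ.

    from datasets import load_dataset
    from transformers import (BertConfig, BertForMaskedLM, BertTokenizerFast,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    # Load and tokenize the 1% BookCorpus slice with the tokenizer trained above.
    tokenizer = BertTokenizerFast(vocab_file="tokenizer_out/vocab.txt")
    dataset = load_dataset("bookcorpus", split="train[:1%]")
    dataset = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
        batched=True, remove_columns=["text"],
    )

    # A BERT model trained from scratch with the masked-LM objective.
    config = BertConfig(vocab_size=tokenizer.vocab_size)
    model = BertForMaskedLM(config)
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True,
                                               mlm_probability=0.15)

    args = TrainingArguments(output_dir="bert_out", per_device_train_batch_size=16,
                             num_train_epochs=1, report_to="wandb")
    Trainer(model=model, args=args, train_dataset=dataset,
            data_collator=collator).train()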

License

Distributed under the MIT License. See LICENSE.txt for more information.
