Mixture of Experts

Introduction

This is a basic, toy implementation of the Mixture of Experts algorithm described in the paper.

The model consists of several expert models, each specializing in a particular task, rather than a single model that has to be good at that task. A gating network (somewhat like attention) then assigns weights to the experts, so that the expert best suited to the task at hand receives more weight.
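To make the gating idea concrete, here is a minimal, framework-free sketch. It is not the code in model.py: the class name `ToyMixtureOfExperts`, the linear experts, and the linear softmax gate are all assumptions chosen for illustration, and the repository's PyTorch model may be structured differently.

```python
import math
import random


def softmax(scores):
    """Turn raw gate scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """A one-output linear unit: dot(weights, x) + bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


class ToyMixtureOfExperts:
    """Hypothetical toy model: linear experts plus a linear softmax gate."""

    def __init__(self, num_experts, input_dim, seed=0):
        rng = random.Random(seed)

        def make_unit():
            # Random (weights, bias) pair for one linear unit.
            return [rng.uniform(-1, 1) for _ in range(input_dim)], rng.uniform(-1, 1)

        self.experts = [make_unit() for _ in range(num_experts)]
        self.gate = [make_unit() for _ in range(num_experts)]

    def gate_weights(self, x):
        # One score per expert, normalized into mixing weights.
        return softmax([linear(w, b, x) for w, b in self.gate])

    def forward(self, x):
        # Final output is the gate-weighted sum of the expert outputs.
        expert_outputs = [linear(w, b, x) for w, b in self.experts]
        weights = self.gate_weights(x)
        return sum(g * y for g, y in zip(weights, expert_outputs))


moe = ToyMixtureOfExperts(num_experts=3, input_dim=2)
x = [0.5, -1.0]
print(moe.gate_weights(x))  # three weights that sum to 1
print(moe.forward(x))       # weighted combination of the expert outputs
```

Because the gate weights are non-negative and sum to 1, the output always lies between the smallest and largest expert output for that input.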

Running the code

The code has been tested with Python 3.7 and PyTorch v1.3.

Training and testing the model

Clone the repository and change into its directory, then run the command for the mode you want:

python main.py --training True    ### For training

python main.py --testing True     ### For testing

The remaining hyperparameter flags are defined in main.py and can be adjusted as needed.
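For readers who want to add or change flags, the sketch below shows one way boolean flags such as `--training True` can be parsed with `argparse`. How main.py actually defines its flags is not shown here, so the helper `str2bool`, the function `build_parser`, and the `--num_experts` flag are all hypothetical. The explicit converter matters because `bool("False")` evaluates to `True` in Python, so `type=bool` would not behave as expected.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False'-style strings; plain bool('False') would be True."""
    text = value.lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser(description="Mixture of Experts (sketch)")
    parser.add_argument("--training", type=str2bool, default=False)
    parser.add_argument("--testing", type=str2bool, default=False)
    # Hypothetical hyperparameter flag, for illustration only.
    parser.add_argument("--num_experts", type=int, default=4)
    return parser


# Equivalent to running: python main.py --training True
args = build_parser().parse_args(["--training", "True"])
print(args.training, args.testing, args.num_experts)
```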

Code structure

main.py: Specifies the hyperparameters used during training, along with the checkpoint locations.
train.py: Script for training (and validating) the model; contains the whole training procedure.
test.py: Script for testing an already trained model.
model.py: Contains the model architecture and the backbone used.
utils.py: Contains helper functions, including the function for loading the dataset.

Further things to be done

I have not yet fully worked out the EM algorithm specified in the paper for optimizing the weights; the reason is explained in the utils.py file.
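As a rough pointer for this open item, EM treatments of mixture-of-experts models commonly use an E-step that computes each expert's posterior responsibility for a training example: the gate weight times the likelihood of the target under that expert, normalized over all experts. The sketch below assumes scalar outputs and Gaussian experts with a fixed variance. These are assumptions made for illustration and may not match the paper's exact formulation or the notes in utils.py; the function names are hypothetical.

```python
import math


def gaussian_likelihood(target, prediction, sigma=1.0):
    """Density of `target` under a normal distribution N(prediction, sigma^2)."""
    z = (target - prediction) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def e_step(gate_weights, expert_predictions, target, sigma=1.0):
    """Posterior responsibility of each expert for one (x, y) pair.

    h_i = g_i * P(y | x, expert i) / sum_j g_j * P(y | x, expert j)
    """
    joint = [g * gaussian_likelihood(target, p, sigma)
             for g, p in zip(gate_weights, expert_predictions)]
    total = sum(joint)
    return [j / total for j in joint]


# Three experts: the first predicts closest to the target and has the largest prior.
h = e_step([0.5, 0.3, 0.2], [0.9, 2.0, -1.0], target=1.0)
print(h)
```

In a typical M-step, these responsibilities would then act as soft targets for the gating network and as per-example weights when refitting each expert.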

aniket-agarwal1999/Mixture_of_Experts
