talrub/DeepLearningCourseProject

Unveiling the effectiveness of the resurrecting RNNs

Tal Rubinstein

Supervised by Edo Cohen

Official repository of the project

Abstract: Capturing long-range dependencies is a fundamental challenge in many machine learning tasks, including natural language processing and time series analysis. In a recent series of works, State Space Models (SSMs) have emerged and proven extremely effective at modeling such dependencies, notably surpassing transformers on benchmarks such as Long Range Arena (LRA). At their core, SSMs are recurrent neural networks (RNNs) with a linear update to the hidden state, which enables efficient implementation and training on very long sequences. Many variants of SSMs have been proposed with different architectural designs. To date, it remains theoretically unclear why SSMs are so effective. This paper empirically investigates the effect of different design choices on the optimization and generalization of SSMs.
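
To make the "linear update to the hidden state" concrete, here is a minimal sketch in plain Python. It is an illustration only and is not taken from this repository's code: the function names, the diagonal transition `lam`, and the input projection `B` are all assumptions chosen for the example. It computes the recurrence h_t = lam * h_{t-1} + B x_t step by step, then recomputes the same states from the unrolled closed form h_t = sum over k <= t of lam^(t-k) * (B x_k). That the two agree is what allows linear recurrences to be evaluated without a strictly sequential loop.

```python
def project(B, x):
    """Matrix-vector product B x using plain lists."""
    return [sum(b_ij * x_j for b_ij, x_j in zip(row, x)) for row in B]


def linear_rnn(inputs, lam, B):
    """Sequential form: h_t = lam * h_{t-1} + B x_t.

    lam is a hypothetical diagonal transition (one scalar per hidden unit),
    so the update is element-wise and has no nonlinearity.
    """
    h = [0.0] * len(lam)
    states = []
    for x_t in inputs:
        u = project(B, x_t)
        h = [l * h_i + u_i for l, h_i, u_i in zip(lam, h, u)]
        states.append(h)
    return states


def linear_rnn_unrolled(inputs, lam, B):
    """Closed form: h_t = sum_{k=0..t} lam^(t-k) * (B x_k).

    Each h_t depends only on the inputs, not on previously computed states.
    """
    projected = [project(B, x) for x in inputs]
    states = []
    for t in range(len(inputs)):
        h_t = [
            sum(lam[i] ** (t - k) * projected[k][i] for k in range(t + 1))
            for i in range(len(lam))
        ]
        states.append(h_t)
    return states


# Illustrative values (assumed, not from the project's experiments).
lam = [0.5, 0.9]
B = [[1.0], [2.0]]
inputs = [[1.0], [0.0], [1.0]]

sequential = linear_rnn(inputs, lam, B)
unrolled = linear_rnn_unrolled(inputs, lam, B)
```

The unrolled form above costs quadratic time in the sequence length and is shown only to check the equivalence. Practical SSM implementations typically rely on convolutions or parallel scans instead, which this sketch does not attempt.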

Repository Organization

| File name | Content |
| --- | --- |
| /configs/table2/mnist_guess_rnn.yaml | Configuration file for the different experiments |
| train_distributed_same_seeds.py | Script to reproduce the results presented in the project's report |
