DEQ-Mnist

A Deep Equilibrium (DEQ) model is a kind of implicit-layer model. Implicit layers have shown impressive results on NLP and vision tasks. One advantage of an implicit layer is memory efficiency: gradients are computed by implicit differentiation at the fixed point, so the intermediate solver iterations do not need to be stored for the backward pass.
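To make the idea of implicit differentiation concrete, here is a minimal sketch on a scalar toy layer. It is not the code used in this repository: the layer `f(z, x) = tanh(w * z + x)` and the names `solve` and `implicit_grad_x` are assumptions chosen for illustration. At the fixed point `z* = f(z*, x)`, the implicit function theorem gives `dz*/dx = (df/dx) / (1 - df/dz)`, which only needs `z*` itself and none of the solver's intermediate iterates.

```python
import math

def solve(x, w, tol=1e-12, max_iter=1000):
    """Find z* with z* = tanh(w * z* + x) by plain fixed-point iteration."""
    z = 0.0
    for _ in range(max_iter):
        z_next = math.tanh(w * z + x)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z

def implicit_grad_x(x, w):
    """dz*/dx from the implicit function theorem, using only the fixed point."""
    z = solve(x, w)
    # At the fixed point tanh(w * z + x) equals z, so tanh'(.) = 1 - z**2.
    s = 1.0 - z * z
    df_dz = w * s   # partial derivative of the layer with respect to z
    df_dx = s       # partial derivative of the layer with respect to x
    return df_dx / (1.0 - df_dz)

def finite_diff_grad_x(x, w, eps=1e-6):
    """Central finite difference through the whole solver, for comparison."""
    return (solve(x + eps, w) - solve(x - eps, w)) / (2.0 * eps)
```

Comparing `implicit_grad_x(0.3, 0.5)` with `finite_diff_grad_x(0.3, 0.5)` is a quick sanity check that the two agree. In a real DEQ the scalar division becomes a linear solve with the Jacobian of the layer, but the memory argument is the same.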

This graph shows the memory used by our DEQ model without implicit differentiation:

alt text

This graph shows the memory used with implicit differentiation:

alt text

Since these models are built on implicit layers, we need a fixed-point solver to find the fixed point that satisfies the desired condition. I used two fixed-point solvers: Anderson acceleration, and a forward solver that repeatedly applies the layer in a forward pass until the condition is satisfied.
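The two solver styles can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the repository's implementation: the toy layer `f`, the function names, and the tolerances are hypothetical, and the Anderson variant keeps only one previous iterate (memory depth 1), whereas practical implementations usually keep several.

```python
import math

def f(z, x, w=0.5):
    # Hypothetical toy layer: elementwise tanh(w * z + x), a contraction for |w| < 1.
    return [math.tanh(w * zi + xi) for zi, xi in zip(z, x)]

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

def forward_solve(layer, z0, tol=1e-8, max_iter=500):
    """Plain forward iteration: z <- layer(z) until the update is small."""
    z = list(z0)
    for k in range(1, max_iter + 1):
        z_next = layer(z)
        step = norm([a - b for a, b in zip(z_next, z)])
        z = z_next
        if step < tol:
            return z, k
    return z, max_iter

def anderson_solve(layer, z0, tol=1e-8, max_iter=500):
    """Anderson acceleration with memory depth 1 (assumed for brevity)."""
    z_prev = list(z0)
    f_prev = layer(z_prev)
    g_prev = [a - b for a, b in zip(f_prev, z_prev)]  # residual f(z) - z
    z = f_prev  # the first step is a plain forward step
    for k in range(2, max_iter + 1):
        fz = layer(z)
        g = [a - b for a, b in zip(fz, z)]
        if norm(g) < tol:
            return z, k
        # Least-squares coefficient minimising ||g - gamma * (g - g_prev)||.
        dg = [a - b for a, b in zip(g, g_prev)]
        denom = sum(d * d for d in dg)
        gamma = sum(a * d for a, d in zip(g, dg)) / denom if denom > 0 else 0.0
        # Combine the two most recent layer outputs with that coefficient.
        z_new = [a - gamma * (a - b) for a, b in zip(fz, f_prev)]
        z_prev, f_prev, g_prev = z, fz, g
        z = z_new
    return z, max_iter
```

For example, with `x = [0.3, -0.2, 0.8]` and `layer = lambda z: f(z, x)`, both `forward_solve(layer, [0.0] * 3)` and `anderson_solve(layer, [0.0] * 3)` return a point and the number of layer evaluations used. The forward solver is the simplest choice; Anderson acceleration reuses past residuals to extrapolate toward the fixed point.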