Model architecture

QueensGambit edited this page Dec 27, 2018 · 2 revisions

CrazyAra v0.2.0 introduced a newly designed architecture called RISE, named after the techniques it combines: ResneXt, Inception, Squeeze and Excitation.

It incorporates ideas and techniques from recent deep learning papers in computer vision:

ResneXt
  He et al. - 2015 - Deep Residual Learning for Image Recognition - https://arxiv.org/pdf/1512.03385.pdf
  Xie et al. - 2016 - Aggregated Residual Transformations for Deep Neural Networks - http://arxiv.org/abs/1611.05431
Inception
  Szegedy et al. - 2015 - Rethinking the Inception Architecture for Computer Vision - https://arxiv.org/pdf/1512.00567.pdf
  Szegedy et al. - 2016 - Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning - https://arxiv.org/pdf/1602.07261.pdf
Squeeze
  Hu et al. - 2017 - Squeeze-and-Excitation Networks - https://arxiv.org/pdf/1709.01507.pdf
Excitation
  Hu et al. - 2017 - Squeeze-and-Excitation Networks - https://arxiv.org/pdf/1709.01507.pdf
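
To make the Squeeze and Excitation part of the name more concrete, here is a minimal sketch of the operation described by Hu et al., written in plain Python so it runs without a deep learning framework. It is not CrazyAra's implementation: the function name `squeeze_excite`, the nested-list data layout and the omission of bias terms are assumptions made for illustration only.

```python
import math


def squeeze_excite(feature_maps, w1, w2):
    """Rescale each channel by a gate in (0, 1) computed from all channels.

    feature_maps: list of C channels, each a list of rows (H x W floats)
    w1: (C/r) x C weight matrix of the reducing layer (bias omitted, an assumption)
    w2: C x (C/r) weight matrix of the expanding layer (bias omitted, an assumption)
    """
    # Squeeze: global average pooling reduces every channel to a single number.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation, step 1: bottleneck layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand back to C values and squash with a sigmoid.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every activation of a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]
    return scaled, gates
```

With all weights set to zero every gate equals 0.5, so each channel is simply halved. Trained weights let the network emphasise some channels and suppress others depending on the whole input.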

Compared to the architecture proposed by DeepMind (19 residual layers with 256 filters), the proposed model has fewer parameters and shorter inference and training times while keeping the same depth. On our benchmark dataset of 10,000 games it achieved a lower validation error with the same learning rate and optimizer settings.
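
The parameter saving can be illustrated with some back-of-the-envelope arithmetic. The sketch below counts convolution weights for the baseline mentioned above, assuming each of the 19 residual layers holds two 3x3 convolutions with 256 filters, and compares one such layer with a grouped bottleneck block in the style of Xie et al. The bottleneck widths (128 channels, 32 groups) are hypothetical values picked for illustration and are not the actual RISE configuration. Batch normalisation, bias terms, the input convolution and the output heads are ignored.

```python
def conv_weights(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights in a 2D convolution without bias terms."""
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


# Baseline from the text: 19 residual layers with 256 filters.
# Assumption: each layer consists of two 3x3 convolutions.
baseline_block = 2 * conv_weights(256, 256, 3)
baseline_tower = 19 * baseline_block

# Hypothetical grouped bottleneck block (NOT the real RISE settings):
# 1x1 reduce to 128 channels, grouped 3x3 with 32 groups, 1x1 expand to 256.
bottleneck_block = (
    conv_weights(256, 128, 1)
    + conv_weights(128, 128, 3, groups=32)
    + conv_weights(128, 256, 1)
)

print(f"baseline block:   {baseline_block:,} weights")
print(f"baseline tower:   {baseline_tower:,} weights")
print(f"bottleneck block: {bottleneck_block:,} weights")
```

Under these assumptions a baseline layer has 1,179,648 convolution weights (22,413,312 for all 19 layers), while the hypothetical bottleneck block has 70,144. This shows how a network can keep the same number of blocks and still shrink considerably.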

[Figures: RISE-Architecture (CrazyAra v0.2) | Vanilla-Resnet Architecture (CrazyAra v0.1); both images are labelled "lr-schedule"]