JAX implementation of 'Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training'
Updated May 23, 2024 - Python
Super-Convergence on CIFAR10
Implementing and fine-tuning BERT for sentiment analysis, paraphrase detection, and semantic textual similarity tasks. Includes code, data, and detailed results.
Sophia optimizer further projected towards flat areas of the loss landscape
Natural Language Processing, Speech Dictation, API controllers
Sophia robot bringup, with test expressions and animations