methods 4
kheyer committed Apr 25, 2019
1 parent 873596c commit 93c71de
Showing 1 changed file with 2 additions and 2 deletions.
Methods/Methods Long Form.ipynb: 4 changes (2 additions, 2 deletions)
@@ -393,7 +393,7 @@
"\n",
"### 5.4 Discriminative Learning Rates\n",
"\n",
"Different layers in the network encode different types of information [25]. In the context of transfer learning, different layers of the pre-trained model need to be fine tuned to different extents. This is done through the use of discriminative learning rates, introduced by [1]. With this technique, higher layers in the model are fine-tuned at higher learning rates compared to the lower layers of the model. Following [1], learning rates follow the function $\\eta^{l-1} = \\eta^{l}/2.6$.\n",
"Different layers in the network encode different types of information [25]. In the context of transfer learning, different layers of the pre-trained model need to be fine tuned to different extents. This is done through the use of discriminative learning rates, introduced by [1]. With this technique, higher layers in the model are fine-tuned at higher learning rates compared to the lower layers of the model. Following [1], learning rates follow the function $\\eta^{l-1} = \\frac{\\eta^{l}}{2.6}$.\n",
"\n",
"Discriminative learning rates are used in fune tuning the language model and training the classification model.\n",
"\n",
@@ -435,7 +435,7 @@
"\n",
"### 5.8 Language Model Fine Tuning\n",
"\n",
"Language Model fine tuning on a classification corpus is done using the One Cycle policy with discriminative learning rates. Discriminative learning rates follow the form $\\eta^{l-1} = \\eta^{l}/2.6$. Learning rates depend on the dataset but typically range from $5e-4$ and $5e-3$. The model is trained using cross entropy loss.\n",
"Language Model fine tuning on a classification corpus is done using the One Cycle policy with discriminative learning rates. Discriminative learning rates follow the form $\\eta^{l-1} = \\frac{\\eta^{l}}{2.6}$. Learning rates depend on the dataset but typically range from $5e-4$ and $5e-3$. The model is trained using cross entropy loss.\n",
"\n",
"### 5.9 Classification Model Training\n",
"\n",
