From 302bd21d7cff0c280624ab6327878f57aa78386c Mon Sep 17 00:00:00 2001
From: forsythd
Date: Wed, 29 Jul 2015 15:02:00 -0400
Subject: [PATCH] Fixed Typo

---
 neural-networks-3.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/neural-networks-3.md b/neural-networks-3.md
index dbde2a04..86245513 100644
--- a/neural-networks-3.md
+++ b/neural-networks-3.md
@@ -325,7 +325,7 @@ As we've seen, training Neural Networks can involve many hyperparameter settings
 - learning rate decay schedule (such as the decay constant)
 - regularization strength (L2 penalty, dropout strength)
 
-But as saw, there are many more relatively less sensitive hyperparameters, for example in per-parameter adaptive learning methods, the setting of momentum and its schedule, etc. In this section we describe some additional tips and tricks for performing the hyperparameter search:
+But as we saw, there are many more relatively less sensitive hyperparameters, for example in per-parameter adaptive learning methods, the setting of momentum and its schedule, etc. In this section we describe some additional tips and tricks for performing the hyperparameter search:
 
 **Implementation**. Larger Neural Networks typically require a long time to train, so performing hyperparameter search can take many days/weeks. It is important to keep this in mind since it influences the design of your code base. One particular design is to have a **worker** that continuously samples random hyperparameters and performs the optimization. During the training, the worker will keep track of the validation performance after every epoch, and writes a model checkpoint (together with miscellaneous training statistics such as the loss over time) to a file, preferably on a shared file system. It is useful to include the validation performance directly in the filename, so that it is simple to inspect and sort the progress. Then there is a second program which we will call a **master**, which launches or kills workers across a computing cluster, and may additionally inspect the checkpoints written by workers and plot their training statistics, etc.
 
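
Below is a minimal sketch of the **worker** loop described in the hunk context above, assuming a local `checkpoints` directory in place of a shared file system; all names (`sample_hyperparams`, `train_one_epoch`, `worker`) and the placeholder training logic are illustrative assumptions, not code from the course notes or the commit.

```python
# Sketch of a hyperparameter-search worker: repeatedly sample random
# hyperparameters, "train", and write a checkpoint after every epoch
# whose filename encodes the validation performance.
import os
import pickle
import random

CHECKPOINT_DIR = "checkpoints"  # ideally a path on a shared file system


def sample_hyperparams():
    # Log-uniform random search over the more sensitive hyperparameters.
    return {
        "learning_rate": 10 ** random.uniform(-6, -3),
        "reg_strength": 10 ** random.uniform(-5, 5),
    }


def train_one_epoch(hp, epoch):
    # Placeholder standing in for real training: returns a fake loss
    # and validation accuracy so the sketch runs end to end.
    loss = 1.0 / (epoch + 1)
    val_acc = random.uniform(0.1, 0.9)
    return loss, val_acc


def worker(num_trials=5, num_epochs=3):
    os.makedirs(CHECKPOINT_DIR, exist_ok=True)
    for trial in range(num_trials):
        hp = sample_hyperparams()
        stats = {"hp": hp, "loss_history": []}
        for epoch in range(num_epochs):
            loss, val_acc = train_one_epoch(hp, epoch)
            stats["loss_history"].append(loss)
            # The validation performance goes directly into the filename,
            # so progress can be inspected and sorted by name alone.
            fname = os.path.join(
                CHECKPOINT_DIR,
                "val_%.4f_trial_%02d_epoch_%02d.pkl" % (val_acc, trial, epoch))
            with open(fname, "wb") as f:
                pickle.dump(stats, f)


if __name__ == "__main__":
    worker()
```

A master process could then simply list and sort the checkpoint filenames to rank runs, since each name carries its validation accuracy, and kill or relaunch workers accordingly.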