Added Activation Function and Increased Training Epochs in the CSV Tutorial #2237
Conversation
Added a non-linear ReLU activation function to make the network a non-linear function approximator, which is usually a desired property when using neural networks. The increase in epochs is necessary to see a significant difference between using an activation function and not using one.
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Preview
Preview and run these notebook edits with Google Colab: Rendered notebook diffs available on ReviewNB.com.
Format and style
Use the TensorFlow docs notebook tools to format for consistent source diffs and lint for style:
$ python3 -m pip install -U --user git+https://github.com/tensorflow/docs
If commits are added to the pull request, synchronize your local branch: git pull origin activation-functions
Right, thanks!
But I'd prefer to keep 10 epochs, just to keep things quick. The point is just to show that it works, not to get all the way to convergence.
This PR adds non-linear ReLU activation functions to the CSV tutorial in order to make the networks non-linear. I noticed that the activation functions are missing: they default to None when not passed explicitly and, to my knowledge, are not added anywhere else. This makes the network linear, which is probably not what we want here, since we don't know whether the dataset is linear. Furthermore, stacking several layers makes little sense in the linear case, as it adds no extra capacity to the network.
Using a non-linear neural network also improves training performance, as shown in the evaluation below. However, the difference only becomes visible when training a little longer, which is why I increased the training epochs to 30.
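For reference, here is a minimal sketch of what the change looks like, assuming the tutorial's small Sequential regression model; the layer sizes and the placeholder data are illustrative, not copied from the tutorial:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Placeholder stand-in for the tutorial's abalone features/labels.
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 7)).astype("float32")
labels = rng.normal(size=(1000, 1)).astype("float32")

# Before: Dense defaults to activation=None, so the stacked layers
# collapse into a single affine (linear) map of the input.
linear_model = tf.keras.Sequential([
    layers.Dense(64),
    layers.Dense(1),
])

# After: a ReLU on the hidden layer makes the model a non-linear
# function approximator.
nonlinear_model = tf.keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(1),
])

nonlinear_model.compile(
    loss=tf.keras.losses.MeanSquaredError(),
    optimizer=tf.keras.optimizers.Adam(),
)

# Train a little longer (30 epochs instead of 10) so the gap between
# the linear and non-linear variants becomes visible.
nonlinear_model.fit(features, labels, epochs=30, verbose=0)
```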
Training on the abalone dataset for 30 epochs over $n=10$ runs without an activation function leads to the following loss (mean $\pm$ std):
$5.10\pm0.04$
Training with the same conditions but with ReLU as activation function:
$4.90\pm0.03$
The training examples later in the tutorial show a similar performance increase when ReLU is used.
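A rough sketch of how such a comparison could be run is below. This is an illustrative reconstruction, not the exact script behind the numbers above; it assumes the mean and standard deviation are taken over the final training loss of each run, with fresh model weights per run:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def build_model(activation=None):
    """Small regression model in the style of the tutorial, with optional activation."""
    model = tf.keras.Sequential([
        layers.Dense(64, activation=activation),
        layers.Dense(1),
    ])
    model.compile(loss=tf.keras.losses.MeanSquaredError(),
                  optimizer=tf.keras.optimizers.Adam())
    return model

def mean_std_loss(activation, features, labels, runs=10, epochs=30):
    """Train `runs` fresh models and report mean/std of the final training loss."""
    final_losses = []
    for _ in range(runs):
        model = build_model(activation)
        history = model.fit(features, labels, epochs=epochs, verbose=0)
        final_losses.append(history.history["loss"][-1])
    return np.mean(final_losses), np.std(final_losses)

# Usage, with `features`/`labels` loaded as in the tutorial:
# print("linear:", mean_std_loss(None, features, labels))
# print("relu:  ", mean_std_loss("relu", features, labels))
```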