For activation functions, ReLU is generally a solid choice for the hidden layers because it helps avoid vanishing gradients, a common training problem with saturating activations like sigmoid and tanh. For the output layer of a regression problem, skip the activation function entirely (which is the same thing as using a linear activation) so the output range stays flexible and unbounded.
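To make the hidden/output split concrete, here's a rough sketch of a forward pass in plain Python. The layer sizes, the weight values, and the helper names (`relu`, `dense`, `forward`) are all made up for illustration; in practice you'd use whatever framework you're already working with.

```python
def relu(z):
    # ReLU: pass positive values through, clamp negatives to zero.
    return z if z > 0.0 else 0.0


def dense(inputs, weights, biases, activation=None):
    # One fully connected layer. `weights` holds one row per output unit.
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


def forward(x, params):
    # Hidden layer uses ReLU; the output layer has no activation (linear),
    # so the prediction can be any real number.
    hidden = dense(x, params["W1"], params["b1"], activation=relu)
    return dense(hidden, params["W2"], params["b2"])


# Hypothetical hand-picked parameters: 2 inputs -> 3 hidden units -> 1 output.
params = {
    "W1": [[0.5, -0.2], [-0.3, 0.8], [0.1, 0.4]],
    "b1": [0.0, 0.1, -0.1],
    "W2": [[1.0, -2.0, 0.5]],
    "b2": [-3.0],
}

prediction = forward([1.0, 2.0], params)
print(prediction)
```

Because the output layer is linear, the prediction here comes out negative, which a ReLU or sigmoid on the output would not allow.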

Initialize your weights to small random numbers. Xavier (Glorot) or He initialization is a good starting point, and He initialization is the usual pairing for ReLU layers.
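If it helps to see what those two schemes actually do, here's a minimal sketch using only the standard library. It uses the common uniform variant of Xavier and the normal variant of He; the function names and layer sizes are assumptions for the example, and real frameworks ship their own versions of both.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    # Xavier/Glorot uniform: sample from U(-limit, limit),
    # with limit = sqrt(6 / (fan_in + fan_out)).
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]


def he_normal(fan_in, fan_out, rng):
    # He normal: sample from N(0, std^2), with std = sqrt(2 / fan_in).
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]


rng = random.Random(0)  # fixed seed so runs are repeatable

# Hypothetical sizes: a 64 -> 32 ReLU hidden layer, then a 32 -> 1 linear output.
W_hidden = he_normal(fan_in=64, fan_out=32, rng=rng)
W_out = xavier_uniform(fan_in=32, fan_out=1, rng=rng)
```

Both schemes scale the spread of the weights by the layer width, so wider layers get smaller individual weights.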

When it comes to training your model, you'll likely use Mean Squared Error (MSE) as your loss function because it's standard for regression problems. Adam is a popular optimizer choice. For the training, I'd say it's a lot of experimen…
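For a feel of how MSE and Adam fit together, here's a small self-contained sketch that fits a straight line to toy data. Everything in it is an assumption for illustration: the data, the learning rate, the step count, and the `mse` and `adam_step` helpers. The Adam hyperparameters `beta1=0.9`, `beta2=0.999`, and `eps=1e-8` are the commonly used defaults.

```python
import math


def mse(preds, targets):
    # Mean Squared Error: average of squared differences.
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def adam_step(params, grads, state, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    # One Adam update, applied in place to `params`.
    state["t"] += 1
    t = state["t"]
    for i, g in enumerate(grads):
        # Running averages of the gradient and the squared gradient.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction for the zero-initialized averages.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)


def predict(params, xs):
    w, b = params
    return [w * x + b for x in xs]


# Toy regression data generated from y = 2x + 1 (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

params = [0.0, 0.0]  # [w, b]
state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}
initial_loss = mse(predict(params, xs), ys)

for _ in range(2000):
    preds = predict(params, xs)
    n = len(xs)
    # Gradients of MSE with respect to w and b.
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
    adam_step(params, [grad_w, grad_b], state)

final_loss = mse(predict(params, xs), ys)
print(initial_loss, final_loss, params)
```

A real network would get its gradients from backpropagation rather than hand-written formulas, but the loss and the update rule work the same way.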

Answer selected by Beluker