2 changes: 1 addition & 1 deletion preliminaries/index.md
@@ -1321,7 +1321,7 @@ slides: true
<img src="00.Preliminaries.key-stage-0097.png" class="slide-image" />

<figcaption>
-<p >On the left is a screenshot of one of the most popular online magazines about machine learning. It's called <a href="https://thegradient.pub/"><strong class="blue">The Gradient</strong></a>. This should tell you that the gradient is a<em> very</em> central idea in machine learning.<br></p><p >The reason is the same as before. In machine learning, the main thing we care about is optimization, finding the highest or the lowest point of a complicated function: solving an argmin problem.<br></p><p >The tangent hyperplane is a<em> local approximation</em> of a function. In general it behaves nothing like the function, but in a very small neighborhood where the two just touch, the tangent hyperplane is a great approximation. That means that so long as we stay in that neighborhood, we know where to move it we want the function to increase.<br></p><p >The idea is that we take a small step in that direction, and then recompute the gradient. This gives us a new, slightly different direction to move in, which allows us to take another small step and so on. So long as we take only small steps before recomputing the gradient, we will always be following our function. This is called gradient ascent. If we want to find the minimum of a function, we take the small steps in the opposite direction <br></p><aside >image source: <a href="http://charlesfranzen.com/posts/multiple-regression-in-python-gradient-descent/"><strong class="blue">http://charlesfranzen.com/posts/multiple-regression-in-python-gradient-descent/</strong></a></aside><aside ><a href="http://charlesfranzen.com/posts/multiple-regression-in-python-gradient-descent/"><strong class="blue"></strong></a></aside>
+<p >On the left is a screenshot of one of the most popular online magazines about machine learning. It's called <a href="https://thegradient.pub/"><strong class="blue">The Gradient</strong></a>. This should tell you that the gradient is a<em> very</em> central idea in machine learning.<br></p><p >The reason is the same as before. In machine learning, the main thing we care about is optimization, finding the highest or the lowest point of a complicated function: solving an argmin problem.<br></p><p >The tangent hyperplane is a<em> local approximation</em> of a function. In general it behaves nothing like the function, but in a very small neighborhood where the two just touch, the tangent hyperplane is a great approximation. That means that so long as we stay in that neighborhood, we know where to move if we want the function to increase.<br></p><p >The idea is that we take a small step in that direction, and then recompute the gradient. This gives us a new, slightly different direction to move in, which allows us to take another small step and so on. So long as we take only small steps before recomputing the gradient, we will always be following our function. This is called gradient ascent. If we want to find the minimum of a function, we take the small steps in the opposite direction <br></p><aside >image source: <a href="http://charlesfranzen.com/posts/multiple-regression-in-python-gradient-descent/"><strong class="blue">http://charlesfranzen.com/posts/multiple-regression-in-python-gradient-descent/</strong></a></aside><aside ><a href="http://charlesfranzen.com/posts/multiple-regression-in-python-gradient-descent/"><strong class="blue"></strong></a></aside>
</figcaption>
</section>
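
The figcaption in this hunk describes the gradient descent/ascent loop in words: take a small step along (or against) the gradient, recompute the gradient, and repeat. As a rough illustration (not part of this diff), here is a minimal sketch of that loop; the function, its gradient, the starting point, and the step size are all hypothetical choices made for the example.

```python
import numpy as np

# Minimal gradient-descent sketch following the figcaption's description:
# take a small step against the gradient, recompute, and repeat.
# f, grad_f, the starting point, and the step size are hypothetical.

def f(x):
    """A simple bowl-shaped function with its minimum at (1, -0.5)."""
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

def grad_f(x):
    """Gradient of f, worked out by hand."""
    return np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)])

x = np.array([3.0, 2.0])  # arbitrary starting point
step_size = 0.1           # keep the steps small, as the text stresses

for _ in range(100):
    # minus the gradient: descent; plus the gradient would give ascent
    x = x - step_size * grad_f(x)

print(x, f(x))  # x ends up close to (1, -0.5), the minimum
```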
