
Commit

Fixed some typos
davidrosenberg committed Sep 7, 2018
1 parent 8950911 commit 7e87282
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions data/lectures-new.yml
@@ -170,7 +170,7 @@
-
#: zero-information prediction function, single-feature prediction functions, regularized linear models, and controversially an \"oracle\" model that is trained on your validation data, to get an idea of the upper bound of performance for your learning method.
Title: "14. Performance Evaluation"
Summary: "<p>This is our second \"black-box\" machine learning lecture. We start by discussing various models that you should almost ways build for your data, to use as baselines and performance sanity checks. From there we focus primarily on evaluating classifier performance. We define a whole slew of performance statistics used in practice (precision, recall, F1, etc.). We also discuss the fact that most classifiers provide a numeric score, and if you need to make a hard classification, you should tune your threshold to optimize the performance metric of importance to you, rather than just using the default (typically 0 or 0.5). We also discuss the various performance curves you'll see in practice: precision/recall, ROC, and (my personal favorite) lift curves.</p>"
Summary: "<p>This is our second \"black-box\" machine learning lecture. We start by discussing various models that you should almost always build for your data, to use as baselines and performance sanity checks. From there we focus primarily on evaluating classifier performance. We define a whole slew of performance statistics used in practice (precision, recall, F1, etc.). We also discuss the fact that most classifiers provide a numeric score, and if you need to make a hard classification, you should tune your threshold to optimize the performance metric of importance to you, rather than just using the default (typically 0 or 0.5). We also discuss the various performance curves you'll see in practice: precision/recall, ROC, and (my personal favorite) lift curves.</p>"
Slides: "https://davidrosenberg.github.io/mlcourse/Archive/2017Fall/Lectures/06b.classifier-performance.pdf"
Video: "https://www.youtube.com/embed/xMyAL0C6cPY"
References:
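The threshold-tuning point in the "Performance Evaluation" summary above is easy to demonstrate. Below is a minimal sketch, not part of the course materials: scikit-learn and the synthetic data set are illustrative assumptions. The idea is to pick the decision threshold on validation data to maximize the metric you care about (here F1) rather than defaulting to 0.5.

# Minimal sketch: tune the classification threshold on validation data
# to maximize F1 instead of using the default 0.5 cutoff.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve, f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = clf.predict_proba(X_val)[:, 1]        # numeric scores, not hard labels

precision, recall, thresholds = precision_recall_curve(y_val, scores)
f1 = 2 * precision * recall / (precision + recall + 1e-12)
best = np.argmax(f1[:-1])                      # last (precision, recall) point has no threshold
print("F1 at default threshold 0.5:", f1_score(y_val, (scores >= 0.5).astype(int)))
print("F1 at tuned threshold %.2f:" % thresholds[best], f1[:-1][best])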
@@ -252,7 +252,7 @@
- "HTF 7.11 (Bootstrap)"
-
Title: "22. Bagging and Random Forests"
Summary: "<p>We motivate bagging as follows: Consider the regression case, and suppose we could create a bunch (say B) prediction functions based on independent training samples of size n. If we average together these prediction functions, the expected value of the average is the same as any one of the functions, but the variance would have decreased by a factor of 1/B -- a clear win! Of course, this would require an overall sample of size nB. The idea of bagging is to replace independent samples with bootstrap samples from a single data set of size n. Of course, the bootstrap samples are not independent, so much of our discussion is about when bagging does and does not lead to improved performance. Random forests were invented as a way to create conditions in which bagging works better.</p>
Summary: "<p>We motivate bagging as follows: Consider the regression case, and suppose we could create a bunch of prediction functions, say B of them, based on B independent training samples of size n. If we average together these prediction functions, the expected value of the average is the same as any one of the functions, but the variance would have decreased by a factor of 1/B -- a clear win! Of course, this would require an overall sample of size nB. The idea of bagging is to replace independent samples with bootstrap samples from a single data set of size n. Of course, the bootstrap samples are not independent, so much of our discussion is about when bagging does and does not lead to improved performance. Random forests were invented as a way to create conditions in which bagging works better.</p>
<details>
<p><summary>More...</summary>Although it's hard to find crisp theoretical results describing when bagging helps, conventional wisdom says that it helps most for models that are \"high variance\", which in this context means the prediction function may change a lot when you train with a new random sample from the same distribution, and \"low bias\", which basically means fitting the training data well. Large decision trees have these characteristics and are usually the model of choice for bagging. Random forests are just bagged trees with one additional twist: only a random subset of features are considered when splitting a node of a tree. The hope, very roughly speaking, is that by injecting this randomness, the resulting prediction functions are less dependent, and thus we'll get a larger reduction in variance. In practice, random forests are one of the most effective machine learning models in many domains.</p></details>"
Slides: "https://davidrosenberg.github.io/mlcourse/Archive/2017Fall/Lectures/10c.bagging-random-forests.pdf"
@@ -318,7 +318,7 @@
- {"An Alternative to EM for GMM [Optional]": "https://arxiv.org/pdf/1706.03267.pdf"}
-
Title: "27. EM Algorithm for Latent Variable Models"
Summary: "It turns out, fitting a Gaussian mixture model by maximum likelihood is easier said than done: there is no closed from solution, and our usual gradient methods do not work well. The standard approach to maximum likelihood estimation in a Gaussian mixture model is the expectation maximization (EM) algorithm. In this lecture, we present the EM algorithm in the general setting of latent variable models, of which GMM is a special case. We present the EM algorithm as a very basic \"variational method\" and indicate a few generalizations."
Summary: "It turns out, fitting a Gaussian mixture model (GMM) by maximum likelihood is easier said than done: there is no closed form solution, and our usual gradient methods do not work well. The standard approach to maximum likelihood estimation in a Gaussian mixture model is the expectation maximization (EM) algorithm. In this lecture, we present the EM algorithm for a general latent variable model, of which GMM is a special case. We present the EM algorithm as a very basic \"variational method\" and indicate a few generalizations."
Slides: "https://davidrosenberg.github.io/mlcourse/Archive/2017Fall/Lectures/13c.EM-algorithm.pdf"
Video: "https://www.youtube.com/embed/lMShR1vjbUo"
References:
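As a companion to the EM summary above, here is a minimal sketch of the algorithm for a one-dimensional, two-component Gaussian mixture. The component count, initialization, and fixed iteration budget are illustrative choices, not taken from the slides. Note the structure: the overall maximum-likelihood problem has no closed-form solution, but each M-step does.

# Minimal EM loop for a 1-D, two-component Gaussian mixture model.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
# synthetic data from two Gaussian components
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])

pi = np.array([0.5, 0.5])              # mixing weights
mu = np.array([-1.0, 1.0])             # component means
sigma = np.array([1.0, 1.0])           # component standard deviations

for _ in range(100):
    # E-step: posterior responsibility of each component for each point
    dens = pi * norm.pdf(x[:, None], mu, sigma)      # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted maximum-likelihood updates (closed form)
    nk = resp.sum(axis=0)
    pi = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

print("weights:", pi.round(2), "means:", mu.round(2), "stds:", sigma.round(2))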
