
Commit

Improve notebook wording per #8
rhiever committed Aug 25, 2015
1 parent 48e701c commit 586b473
Showing 1 changed file with 6 additions and 2 deletions.
@@ -1628,6 +1628,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You'll notice that we used **Stratified *k*-fold cross-validation** in the code above. Stratified *k*-fold keeps the class proportions the same across all of the folds, which is vital for maintaining a representative subset of our data set. (e.g., so we don't have 100% `Iris setosa` entries in one of the folds.)\n",
"\n",
"We can perform 10-fold cross-validation on our model with the following code:"
]
},
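
The cell added above explains stratified 10-fold cross-validation in prose. For reference, a minimal sketch of what the cross-validation code it refers to might look like with scikit-learn's current API (the DecisionTreeClassifier and the variable names here are assumptions for illustration, not the notebook's actual cell):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    iris = load_iris()

    # Stratified k-fold preserves the class proportions in every fold,
    # so no fold ends up holding only Iris setosa entries
    skf = StratifiedKFold(n_splits=10)
    scores = cross_val_score(DecisionTreeClassifier(), iris.data, iris.target, cv=skf)
    print("Mean CV accuracy:", scores.mean())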
@@ -1976,9 +1978,11 @@
"source": [
"Hmmm... that's a little boring by itself though. How about we compare another classifier to see how they perform?\n",
"\n",
"We already know from previous projects that Random Forest classifiers usually work better than individual decision trees. A common problem that decision trees face is that they're prone to overfitting: They complexify to the point that they classify the training set near-perfectly, but fail to generalize.\n",
"We already know from previous projects that Random Forest classifiers usually work better than individual decision trees. A common problem that decision trees face is that they're prone to overfitting: They complexify to the point that they classify the training set near-perfectly, but fail to generalize to data they have not seen before.\n",
"\n",
"**Random Forests** work around that limitation by creating a whole bunch of decision trees (hence \"forest\") — each trained on a random subset of the features — and have the decision trees work together to make a better classification.\n",
"\n",
"**Random Forests** work around that limitation by creating a whole bunch of decision trees (hence \"forest\") — each with a subset of the features — and have the decision trees work together to make a better classification. Let that be a lesson to you: **Even in Machine Learning, you get better results when you work together!**\n",
"Let that be a lesson to you: **Even in Machine Learning, you get better results when you work together!**\n",
"\n",
"Let's see if a Random Forest classifier works better here.\n",
"\n",
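The reworded cell contrasts a single, overfitting-prone decision tree with a Random Forest built from many trees, each trained on a random subset of the features. A minimal sketch of that comparison, under the same illustrative assumptions as above rather than the notebook's actual code:

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    iris = load_iris()

    # A lone decision tree can fit the training data near-perfectly yet fail
    # to generalize; the forest averages many trees, each grown on a bootstrap
    # sample and limited to a random subset of features at each split
    for clf in (DecisionTreeClassifier(), RandomForestClassifier(n_estimators=100)):
        scores = cross_val_score(clf, iris.data, iris.target, cv=10)
        print(type(clf).__name__, scores.mean())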
