On average, ~63.2% of the original data points of the original dataset will be present in a given bootstrap sample. The other ~36.8% are repeated samples.
I think the last part is wrong: the 36.8% are not repeated samples; rather, 36.8% of the original dataset is not in the bootstrap sample. I guess this sentence can be removed, since we already say that 63.2% are in the bootstrap sample.
If we want to talk about repeated samples, we can say that the bootstrap sample is the same size as the original dataset and contains only 63.2% of the original data points, so there will be repeated samples.
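A quick simulation can make the distinction concrete. This is only a minimal sketch using the standard library; the function name `in_bag_fraction` and its parameters are made up for illustration and do not come from the MOOC notebook.

```python
import random


def in_bag_fraction(n_samples, n_repeats=50, seed=0):
    """Average fraction of distinct original points found in a bootstrap sample.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_repeats):
        # Draw n_samples indices with replacement; the set keeps distinct ones.
        drawn = {rng.randrange(n_samples) for _ in range(n_samples)}
        total += len(drawn) / n_samples
    return total / n_repeats


n = 10_000
in_bag = in_bag_fraction(n)
theory = 1 - (1 - 1 / n) ** n  # tends to 1 - 1/e as n grows

print(f"in the bootstrap sample:  {in_bag:.3f}")
print(f"left out of the sample:   {1 - in_bag:.3f}")
print(f"theoretical in-bag value: {theory:.3f}")
```

The in-bag fraction should come out close to 0.632, and its complement (close to 0.368) is the share of original points that never get drawn. Since the bootstrap sample has the same size as the dataset, the positions not filled by a first-time draw necessarily hold repeats.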
https://inria.github.io/scikit-learn-mooc/python_scripts/ensemble_bagging.html
I seem to remember mentioning this in the past, and indeed I managed to find it: https://github.com/INRIA/scikit-learn-mooc/pull/53/files