Commit

fixes for Issues 410, 412 in topic 5, part 1
Kashnitskiy, Yury committed Nov 9, 2018
1 parent cde53b9 commit e08d99f
Showing 1 changed file with 20 additions and 6 deletions.
@@ -100,20 +100,34 @@
"cell_type": "code",
"execution_count": 1,
"metadata": {},
-"outputs": [],
+"outputs": [
+{
+"data": {
+"image/png": "\n",
+"text/plain": [
+"<Figure size 432x288 with 1 Axes>"
+]
+},
+"metadata": {
+"needs_background": "light"
+},
+"output_type": "display_data"
+}
+],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"sns.set()\n",
"%matplotlib inline\n",
"from matplotlib import pyplot as plt\n",
"\n",
"telecom_data = pd.read_csv('../../data/telecom_churn.csv')\n",
"\n",
"telecom_data.loc[telecom_data['Churn'] == False,\n",
-" 'Customer service calls'].plot(kind='kde', label='Loyal')\n",
+" 'Customer service calls'].hist(label='Loyal')\n",
"telecom_data.loc[telecom_data['Churn'] == True,\n",
-" 'Customer service calls'].plot(kind='kde', label='Churn')\n",
+" 'Customer service calls'].hist(label='Churn')\n",
"plt.xlabel('Number of calls')\n",
"plt.ylabel('Density')\n",
"plt.legend();"
@@ -123,7 +137,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"As you can see, loyal customers make fewer calls to customer service than those who eventually left. Now, it might be a good idea to estimate the average number of customer service calls in each group. Since our dataset is small, we would not get a good estimate by simply calculating the mean of the original sample. We will be better off applying the bootstrap method. Let's generate 1000 new bootstrap samples from our original population and produce an interval estimate of the mean."
+"Looks like loyal customers make fewer calls to customer service than those who eventually left. Now, it might be a good idea to estimate the average number of customer service calls in each group. Since our dataset is small, we would not get a good estimate by simply calculating the mean of the original sample. We will be better off applying the bootstrap method. Let's generate 1000 new bootstrap samples from our original population and produce an interval estimate of the mean."
]
},
{
@@ -208,8 +222,8 @@
"\n",
"$$\\large \\E_x\\left[\\left(b_i(x) - y(x)\\right)^{2}\\right] = \\E_x\\left[\\varepsilon_i^{2}(x)\\right].$$\n",
"\n",
-"Then, the mean error over all the regression functions will look as follows: \n",
-"$$ \\large \\E_1 = \\frac{1}{n} \\E_x\\left[\\varepsilon_i^{2}(x)\\right]$$\n",
+"Then, the mean error over all regression functions will look as follows: \n",
+"$$ \\large \\E_1 = \\frac{1}{n} \\E_x\\left[ \\sum_i^n \\varepsilon_i^{2}(x)\\right]$$\n",
"\n",
"We'll assume that the errors are unbiased and uncorrelated, that is: \n",
"\n",
Expand Down
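The changed markdown cell describes drawing 1000 bootstrap samples and producing an interval estimate of the mean. Below is a minimal sketch of that step, not the notebook's own code. It assumes a percentile interval and uses synthetic Poisson counts in place of the `Customer service calls` column from `telecom_churn.csv`. The helper names `bootstrap_means` and `percentile_interval` are illustrative and do not come from this commit.

```python
import numpy as np


def bootstrap_means(values, n_samples=1000, seed=0):
    """Return the mean of each of n_samples bootstrap resamples of values."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Draw indices with replacement; each row is one bootstrap sample
    # of the same size as the original data.
    idx = rng.integers(0, len(values), size=(n_samples, len(values)))
    return values[idx].mean(axis=1)


def percentile_interval(stats, alpha=0.05):
    """Percentile interval covering 1 - alpha of the bootstrap statistics."""
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])


# Synthetic stand-in data (assumption): the rates and group sizes are made up
# and only mimic "loyal customers call less often than churned ones".
rng = np.random.default_rng(42)
loyal_calls = rng.poisson(lam=1.5, size=500)
churn_calls = rng.poisson(lam=2.5, size=100)

loyal_interval = percentile_interval(bootstrap_means(loyal_calls))
churn_interval = percentile_interval(bootstrap_means(churn_calls))

print(f"Loyal: mean calls in [{loyal_interval[0]:.2f}, {loyal_interval[1]:.2f}]")
print(f"Churn: mean calls in [{churn_interval[0]:.2f}, {churn_interval[1]:.2f}]")
```

With the real data, the two synthetic arrays would be replaced by the `Customer service calls` values for `Churn == False` and `Churn == True`. The percentile interval is one common choice; other bootstrap interval constructions exist.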
