updated kernelspec with script (#1251)
1 parent
aa2b3e8
commit 1a7b49f
Showing 19 changed files with 7,024 additions and 6,987 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,165 +1,165 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "53329df4", | ||
"metadata": {}, | ||
"source": [ | ||
"***\n", | ||
"**Introduction to Machine Learning** <br>\n", | ||
"__[https://slds-lmu.github.io/i2ml/](https://slds-lmu.github.io/i2ml/)__\n", | ||
"***" | ||
] | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "53329df4", | ||
"metadata": {}, | ||
"source": [ | ||
"***\n", | ||
"**Introduction to Machine Learning** <br>\n", | ||
"__[https://slds-lmu.github.io/i2ml/](https://slds-lmu.github.io/i2ml/)__\n", | ||
"***" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "56e690c1", | ||
"metadata": {}, | ||
"source": [ | ||
"# Exercise sheet 6: Evaluation 2" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"id": "2848e780", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"#| label: import\n", | ||
"# Consider the following libraries for this exercise sheet:\n", | ||
"\n", | ||
"# general\n", | ||
"import numpy as np\n", | ||
"import pandas as pd\n", | ||
"\n", | ||
"# sklearn\n", | ||
"from sklearn.linear_model import LogisticRegression\n", | ||
"from sklearn.preprocessing import LabelEncoder\n", | ||
"from sklearn.model_selection import train_test_split\n", | ||
"from sklearn.model_selection import RepeatedKFold\n", | ||
"from sklearn.model_selection import RepeatedStratifiedKFold" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "bb7dd93d", | ||
"metadata": {}, | ||
"source": [ | ||
"## Exercise 2: Resampling strategies" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "5ca487b7", | ||
"metadata": {}, | ||
"source": [ | ||
"> a) Why would we apply resampling rather than a single holdout split?\n", | ||
"\n", | ||
"> **\\# Enter your answer here:**" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "086f9397", | ||
"metadata": {}, | ||
"source": [ | ||
"> b) Classify the `german_credit` data into solvent and insolvent debtors using logistic regression. Compute the\n", | ||
"training error w.r.t. MCE." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "a2c7a878", | ||
"metadata": {}, | ||
"source": [ | ||
"<div class=\"alert alert-block alert-info\">\n", | ||
" <b>Hint:</b> Read the already preprocessed file <a href=\"https://github.com/slds-lmu/lecture_i2ml/blob/master/exercises/data/german_credit_for_py.csv\"><code>german_credit_for_py.csv</code></a>.<br>\n", | ||
"</div>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"id": "582a28c4", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Enter your code here:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "d7767419", | ||
"metadata": {}, | ||
"source": [ | ||
"> c) In order to evaluate your learner, compare test MCE using\n", | ||
">> (i) three times ten-fold cross validation (3x10-CV) <br>\n", | ||
">> (ii) 10x3-CV <br>\n", | ||
">> (iii) 3x10-CV with stratification for the feature `foreign_worker` to ensure equal representation in all folds <br>\n", | ||
">> (iv) a single holdout split with $90\\%$ training data" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "0a9c9f8e", | ||
"metadata": {}, | ||
"source": [ | ||
"<div class=\"alert alert-block alert-info\">\n", | ||
" <b>Hint:</b> you will need <code>RepeatedKFold</code>, <code>RepeatedStratifiedKFold</code>, and <code>train_test_split</code>. <br>\n", | ||
"</div>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"id": "2091102d", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Enter your code here:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "9f12230c", | ||
"metadata": {}, | ||
"source": [ | ||
"> d) Discuss and compare your findings from c) and compare them to the training error from b).\n", | ||
"\n", | ||
"> **\\# Enter your answer here:**" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "7bf9bb5e", | ||
"metadata": {}, | ||
"source": [ | ||
"> e) Would you consider LOO-CV to be a good alternative?\n", | ||
"\n", | ||
"> **\\# Enter your answer here:**" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python (I2ML)", | ||
"language": "python", | ||
"name": "python-i2ml" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.13" | ||
} | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "56e690c1", | ||
"metadata": {}, | ||
"source": [ | ||
"# Exercise sheet 6: Evaluation 2" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "2848e780", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"#| label: import\n", | ||
"# Consider the following libraries for this exercise sheet:\n", | ||
"\n", | ||
"# general\n", | ||
"import numpy as np\n", | ||
"import pandas as pd\n", | ||
"\n", | ||
"# sklearn\n", | ||
"from sklearn.linear_model import LogisticRegression\n", | ||
"from sklearn.preprocessing import LabelEncoder\n", | ||
"from sklearn.model_selection import train_test_split\n", | ||
"from sklearn.model_selection import RepeatedKFold\n", | ||
"from sklearn.model_selection import RepeatedStratifiedKFold" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "bb7dd93d", | ||
"metadata": {}, | ||
"source": [ | ||
"## Exercise 2: Resampling strategies" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "5ca487b7", | ||
"metadata": {}, | ||
"source": [ | ||
"> a) Why would we apply resampling rather than a single holdout split?\n", | ||
"\n", | ||
"> **\\# Enter your answer here:**" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "086f9397", | ||
"metadata": {}, | ||
"source": [ | ||
"> b) Classify the `german_credit` data into solvent and insolvent debtors using logistic regression. Compute the\n", | ||
"training error w.r.t. MCE." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "a2c7a878", | ||
"metadata": {}, | ||
"source": [ | ||
"<div class=\"alert alert-block alert-info\">\n", | ||
" <b>Hint:</b> Read the already preprocessed file <a href=\"https://github.com/slds-lmu/lecture_i2ml/blob/master/exercises/data/german_credit_for_py.csv\"><code>german_credit_for_py.csv</code></a>.<br>\n", | ||
"</div>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "582a28c4", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Enter your code here:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "d7767419", | ||
"metadata": {}, | ||
"source": [ | ||
"> c) In order to evaluate your learner, compare test MCE using\n", | ||
">> (i) three times ten-fold cross validation (3x10-CV) <br>\n", | ||
">> (ii) 10x3-CV <br>\n", | ||
">> (iii) 3x10-CV with stratification for the feature `foreign_worker` to ensure equal representation in all folds <br>\n", | ||
">> (iv) a single holdout split with $90\\%$ training data" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "0a9c9f8e", | ||
"metadata": {}, | ||
"source": [ | ||
"<div class=\"alert alert-block alert-info\">\n", | ||
" <b>Hint:</b> you will need <code>RepeatedKFold</code>, <code>RepeatedStratifiedKFold</code>, and <code>train_test_split</code>. <br>\n", | ||
"</div>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "2091102d", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Enter your code here:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "9f12230c", | ||
"metadata": {}, | ||
"source": [ | ||
"> d) Discuss and compare your findings from c) and compare them to the training error from b).\n", | ||
"\n", | ||
"> **\\# Enter your answer here:**" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "7bf9bb5e", | ||
"metadata": {}, | ||
"source": [ | ||
"> e) Would you consider LOO-CV to be a good alternative?\n", | ||
"\n", | ||
"> **\\# Enter your answer here:**" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.12" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
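The commit title says the kernelspec was updated with a script, but the script itself is not part of this diff. A minimal sketch of what such a script might look like is given below, using only the standard library. The function names `set_kernelspec` and `update_notebook_file` are hypothetical, and the target values (`Python (I2ML)` / `python-i2ml`) are assumed from the kernelspec entries visible in the diff; the actual script in the repository may work differently.

```python
import json
from pathlib import Path

# Assumed target kernelspec, copied from the values visible in the diff.
KERNELSPEC = {
    "display_name": "Python (I2ML)",
    "language": "python",
    "name": "python-i2ml",
}


def set_kernelspec(nb: dict, kernelspec: dict = KERNELSPEC) -> dict:
    """Return a shallow copy of a notebook dict with metadata.kernelspec replaced.

    The input dict and its metadata dict are left unmodified.
    """
    updated = dict(nb)
    metadata = dict(updated.get("metadata", {}))
    metadata["kernelspec"] = dict(kernelspec)
    updated["metadata"] = metadata
    return updated


def update_notebook_file(path: Path, kernelspec: dict = KERNELSPEC) -> bool:
    """Rewrite one .ipynb file in place; return True if its content changed."""
    nb = json.loads(path.read_text(encoding="utf-8"))
    new_nb = set_kernelspec(nb, kernelspec)
    if new_nb == nb:
        return False  # kernelspec already matches, nothing to write
    # indent=1 is an assumption about the desired on-disk formatting.
    text = json.dumps(new_nb, indent=1, ensure_ascii=False) + "\n"
    path.write_text(text, encoding="utf-8")
    return True
```

A call such as `update_notebook_file(Path("some_notebook.ipynb"))` (hypothetical file name) would rewrite a single notebook. Note that re-serializing the JSON can change indentation across the whole file, which may explain why a diff like this one touches every line even though only the metadata differs in substance.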