Merged
34 changes: 21 additions & 13 deletions predict-credit-churn/CreditChurn.ipynb
@@ -94,18 +94,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"First let's use some built in functions from EvalML to convert the data to a woodwork data structure and then cast its dtypes to something we'd rather work with. Then we're going to take a look at some of the unique, non-numeric values in the features. Sure enough, `Education_Level`, `Marital_Status`, and `Income_Category` have `Unknown` as a value. This is something we'll have to remember before we get to the model training, since `Unknown` isn't an acceptable value for any of the features."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"from evalml.utils.gen_utils import _convert_to_woodwork_structure, _convert_woodwork_types_wrapper\n",
-"data = _convert_to_woodwork_structure(data)\n",
-"data = _convert_woodwork_types_wrapper(data.to_dataframe())"
+"We're going to take a look at some of the unique, non-numeric values in the features. Sure enough, `Education_Level`, `Marital_Status`, and `Income_Category` have `Unknown` as a value. This is something we'll have to remember before we get to the model training, since `Unknown` isn't an acceptable value for any of the features."
]
},
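The check this markdown cell describes could be sketched in plain pandas roughly as follows. The toy frame and the helper name `columns_containing` are assumptions made for illustration; only the column names and the `Unknown` placeholder come from the notebook.

```python
import pandas as pd

# Toy stand-in for the notebook's `data` frame; the values are made up.
data = pd.DataFrame({
    "Education_Level": ["Graduate", "Unknown", "High School"],
    "Marital_Status": ["Married", "Single", "Unknown"],
    "Income_Category": ["Less than $40K", "Unknown", "Less than $40K"],
    "Credit_Limit": [1000.0, 2500.0, 4000.0],
})

def columns_containing(df, placeholder="Unknown"):
    """Return the non-numeric columns in which `placeholder` appears (hypothetical helper)."""
    non_numeric = df.select_dtypes(exclude="number")
    return [col for col in non_numeric.columns
            if (non_numeric[col] == placeholder).any()]

flagged = columns_containing(data)

# Show the distinct values of each flagged column so the placeholder is visible.
for col in flagged:
    print(col, list(data[col].unique()))
```

Restricting the scan to non-numeric columns keeps the comparison against a string placeholder meaningful and skips columns such as `Credit_Limit`.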
{
@@ -183,7 +172,7 @@
"outputs": [],
"source": [
"X = data.copy()\n",
-"data = data.drop(['Credit_Limit'], axis=1)\n",
+"X = X.drop(['Credit_Limit'], axis=1)\n",
"y = X.pop('Attrition_Flag')\n",
"\n",
"X['Income_Category'] = X['Income_Category'].replace({'Less than $40K':0,\n",
@@ -230,6 +219,25 @@
"X = preprocessing(X, y)"
]
},
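The change to the `drop` call in this cell matters because `X = data.copy()` makes `X` independent of `data`, so dropping a column from `data` afterwards leaves `X` untouched. A minimal sketch with a made-up frame (`Some_Feature` is a hypothetical column; the other names come from the notebook):

```python
import pandas as pd

# Made-up frame; `Some_Feature` is a hypothetical column added for illustration.
data = pd.DataFrame({
    "Credit_Limit": [1000.0, 2500.0],
    "Some_Feature": [3, 7],
    "Attrition_Flag": [0, 1],
})

# Before the change: X is copied first, then the drop is applied to `data`,
# so the copy still carries `Credit_Limit`.
X_before = data.copy()
data_before = data.drop(["Credit_Limit"], axis=1)

# After the change: the drop is applied to the copy itself.
X_after = data.copy()
X_after = X_after.drop(["Credit_Limit"], axis=1)
y = X_after.pop("Attrition_Flag")

print(list(X_before.columns))
print(list(X_after.columns))
```

With the earlier line, `Credit_Limit` would have stayed in the feature matrix passed on to preprocessing.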
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Using `infer_feature_types`, we can convert our dataset into a [Woodwork](https://github.com/alteryx/woodwork) data structure, and even [specify what types](https://evalml.alteryx.com/en/stable/user_guide/automl.html) certain features should be. For example, we want to cast `Income_Category` as a categorical type, rather than the natural language type it was inferred as."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"from evalml.utils.gen_utils import infer_feature_types\n",
+"X = infer_feature_types(X, feature_types={'Income_Category': 'categorical',\n",
+" 'Education_Level': 'categorical'})\n",
+"X"
+]
+},
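For readers without EvalML installed, the idea of overriding an inferred type for specific columns can be approximated in plain pandas. This is only an analogue under that assumption, not what `infer_feature_types` does internally; the toy values and the `overrides` name are illustrative.

```python
import pandas as pd

# Toy feature matrix; the values are made up for illustration.
X = pd.DataFrame({
    "Income_Category": [0, 2, 1],
    "Education_Level": ["Graduate", "High School", "Graduate"],
})

# Map each column to the dtype we want instead of the one pandas inferred.
overrides = {"Income_Category": "category", "Education_Level": "category"}
X_typed = X.astype(overrides)

print(X_typed.dtypes)
```

Passing a column-to-dtype mapping to `astype` changes only the listed columns and returns a new frame, leaving `X` as it was.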
{
"cell_type": "markdown",
"metadata": {},