# Dealing with FutureWarning

27 August 2022

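The following code triggers a FutureWarning in some Anaconda Jupyter Notebook installations (notably on Mac). The warning is issued by pandas, so whether it appears depends on the pandas version installed.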

> `loans_df.groupby('Gender').aggregate(['mean', 'median'])`

> /var/folders/rf/p62x082n5b9cdpt01hwbj0nw0000gn/T/ipykernel_7766/385289080.py:3: FutureWarning: ['Loan_ID', 'Married', 'Dependents', 'Property_Area', 'Loan_Status'] did not aggregate successfully. If any error is raised this will raise in a future version of pandas. Drop these columns/ops to avoid this warning.

The FutureWarning is caused by asking pandas to run numerical functions (here, mean and median) on non-numerical columns.

Let's see what we can do to avoid it.

Start by importing some data into a dataframe:

In [19]:
import pandas as pd
from io import StringIO

data = StringIO("""Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
LP001002,Male,No,0,Graduate,No,5849,0,,360,1,Urban,Y
LP001003,Male,Yes,1,Graduate,No,4583,1508,128,360,1,Rural,N
LP001005,Male,Yes,0,Graduate,Yes,3000,0,66,360,1,Urban,Y
LP002741,Female,Yes,1,Graduate,No,4608,2845,140,180,1,Semiurban,Y
LP002743,Female,No,0,Graduate,No,2138,0,99,360,0,Semiurban,N
LP002753,Female,No,1,Graduate,,3652,0,95,360,1,Semiurban,Y
LP002755,Male,Yes,1,Not Graduate,No,2239,2524,128,360,1,Urban,Y
LP002757,Female,Yes,0,Not Graduate,No,3017,663,102,360,,Semiurban,Y
""")

df = pd.read_csv(data)
df

,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001002,Male,No,0,Graduate,No,5849,0,,360,1.0,Urban,Y
1,LP001003,Male,Yes,1,Graduate,No,4583,1508,128.0,360,1.0,Rural,N
2,LP001005,Male,Yes,0,Graduate,Yes,3000,0,66.0,360,1.0,Urban,Y
3,LP002741,Female,Yes,1,Graduate,No,4608,2845,140.0,180,1.0,Semiurban,Y
4,LP002743,Female,No,0,Graduate,No,2138,0,99.0,360,0.0,Semiurban,N
5,LP002753,Female,No,1,Graduate,,3652,0,95.0,360,1.0,Semiurban,Y
6,LP002755,Male,Yes,1,Not Graduate,No,2239,2524,128.0,360,1.0,Urban,Y
7,LP002757,Female,Yes,0,Not Graduate,No,3017,663,102.0,360,,Semiurban,Y
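
Before filtering, it can help to confirm which columns pandas treats as non-numeric, since those are the ones that mean and median cannot handle. The sketch below uses a reduced sample with a few of the column names from above; `sample_df` and `non_numeric_cols` are illustrative names, not part of the original notebook.

```python
import pandas as pd
from io import StringIO

# Reduced, illustrative sample using a few of the column names from above.
sample = StringIO("""Loan_ID,Gender,Married,ApplicantIncome,LoanAmount
LP001002,Male,No,5849,
LP001003,Male,Yes,4583,128
LP002741,Female,Yes,4608,140
LP002743,Female,No,2138,99
""")
sample_df = pd.read_csv(sample)

# Columns that are not numeric; mean/median cannot be computed on these.
non_numeric_cols = sample_df.select_dtypes(exclude='number').columns.tolist()
print(non_numeric_cols)  # ['Loan_ID', 'Gender', 'Married']
```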


We can use select_dtypes() to filter out the string columns:

In [20]:
# Keep only the numerical columns by excluding string (object) columns with select_dtypes():
num_only = df.select_dtypes(exclude=['object'])
num_only

,Dependents,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History
0,0,5849,0,,360,1.0
1,1,4583,1508,128.0,360,1.0
2,0,3000,0,66.0,360,1.0
3,1,4608,2845,140.0,180,1.0
4,0,2138,0,99.0,360,0.0
5,1,3652,0,95.0,360,1.0
6,1,2239,2524,128.0,360,1.0
7,0,3017,663,102.0,360,
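
Excluding `object` columns keeps everything else, which may include boolean or datetime columns. If the goal is strictly numeric columns, `select_dtypes(include='number')` states that directly. The following is a minimal sketch on a made-up frame; the `Is_Graduate` column is hypothetical and only there to show how the two selections can differ.

```python
import pandas as pd

# Small illustrative frame; Is_Graduate is a hypothetical boolean column.
frame = pd.DataFrame({
    "Gender": ["Male", "Female", "Female"],
    "ApplicantIncome": [5849, 4608, 2138],
    "Is_Graduate": [True, True, False],
})

# Excluding object columns still keeps the boolean column.
kept_by_exclude = frame.select_dtypes(exclude=['object']).columns.tolist()

# Including only numeric dtypes drops both the string and the boolean column.
kept_by_include = frame.select_dtypes(include='number').columns.tolist()

print(kept_by_exclude)
print(kept_by_include)  # ['ApplicantIncome']
```

Which form is preferable depends on whether non-numeric, non-string columns should take part in the aggregation.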


Unfortunately, Gender is a string column, so it was excluded as well, and we can no longer group by it:

In [25]:
num_only.groupby('Gender').aggregate(['mean', 'median'])

KeyError: 'Gender'
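
The error can be reproduced and checked on a small frame. This is a sketch with made-up names (`small_df`, `small_num`, `caught`), showing that the grouping key is simply absent from the filtered columns.

```python
import pandas as pd

# Illustrative frame with one string column and one numeric column.
small_df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female"],
    "ApplicantIncome": [5849, 4608, 2138],
})
small_num = small_df.select_dtypes(include='number')

# The grouping key is no longer among the columns.
print('Gender' in small_num.columns)  # False

# Grouping by a missing column raises KeyError.
caught = None
try:
    small_num.groupby('Gender')
except KeyError as err:
    caught = err
print(type(caught).__name__)  # KeyError
```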

We will need to concatenate the Gender column with our numerical columns:

In [27]:
# Concatenate the Gender column with the numerical columns:
new_df = pd.concat([df['Gender'], num_only], axis=1)
new_df

,Gender,Dependents,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History
0,Male,0,5849,0,,360,1.0
1,Male,1,4583,1508,128.0,360,1.0
2,Male,0,3000,0,66.0,360,1.0
3,Female,1,4608,2845,140.0,180,1.0
4,Female,0,2138,0,99.0,360,0.0
5,Female,1,3652,0,95.0,360,1.0
6,Male,1,2239,2524,128.0,360,1.0
7,Female,0,3017,663,102.0,360,
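
As an alternative to concatenating, `groupby()` also accepts a Series that shares the frame's index, so the numeric-only frame can be grouped by the original Gender column without rebuilding a combined frame. A minimal sketch, assuming a small made-up frame named `loans`:

```python
import pandas as pd

# Illustrative frame; values are taken from a few rows of the data above.
loans = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female"],
    "ApplicantIncome": [5849, 4583, 4608, 2138],
    "LoanAmount": [float("nan"), 128.0, 140.0, 99.0],
})
num_part = loans.select_dtypes(include='number')

# Group the numeric frame by the Gender Series from the original frame.
# Rows are matched by index, so both objects must share the same index.
by_series = num_part.groupby(loans['Gender']).aggregate(['mean', 'median'])

print(by_series.loc['Male', ('ApplicantIncome', 'mean')])  # 5216.0
print(by_series.loc['Female', ('LoanAmount', 'mean')])     # 119.5
```

This relies on the index being unchanged between `loans` and `num_part`; if rows were dropped or reordered with a reset index, the concatenation approach or an explicit column selection would be safer.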


Now we can run the groupby with aggregate:

In [26]:
new_df.groupby('Gender').aggregate(['mean', 'median'])

,Dependents,Dependents,ApplicantIncome,ApplicantIncome,CoapplicantIncome,CoapplicantIncome,LoanAmount,LoanAmount,Loan_Amount_Term,Loan_Amount_Term,Credit_History,Credit_History
Gender,mean,median,mean,median,mean,median,mean,median,mean,median,mean,median
Female,0.5,0.5,3353.75,3334.5,877.0,331.5,109.0,100.5,315.0,360.0,0.666667,1.0
Male,0.5,0.5,3917.75,3791.5,1008.0,754.0,107.333333,128.0,360.0,360.0,1.0,1.0
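
The same result can likely be reached in one step by selecting the numeric columns on the grouped object, which avoids building an intermediate frame. This is a sketch under the assumption of a small made-up frame (`loans_small`); `numeric_cols` and `summary` are illustrative names.

```python
import pandas as pd

# Illustrative frame with string and numeric columns.
loans_small = pd.DataFrame({
    "Loan_ID": ["LP001002", "LP001003", "LP002741", "LP002743"],
    "Gender": ["Male", "Male", "Female", "Female"],
    "ApplicantIncome": [5849, 4583, 4608, 2138],
    "LoanAmount": [float("nan"), 128.0, 140.0, 99.0],
})

# Names of the numeric columns only.
numeric_cols = loans_small.select_dtypes(include='number').columns.tolist()

# Group on the full frame, then restrict the aggregation to numeric columns.
summary = loans_small.groupby('Gender')[numeric_cols].aggregate(['mean', 'median'])
print(summary)
```

Because the string columns never reach `aggregate()`, there is nothing for pandas to warn about.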
