Windows only: problem with mutual_info_classif when discrete_features=True #9772

@csalazar94

Description

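On Windows, mutual_info_classif returns nan when called with discrete_features=True and emits the following warning:

RuntimeWarning: invalid value encountered in log
log_outer = -np.log(outer) + log(pi.sum()) + log(pj.sum())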

Steps/Code to Reproduce

import pandas as pd
from sklearn.feature_selection import mutual_info_classif

a = [1]*(52632+2529) + [2]*(14660+793) + [3]*(3271+204) + [4]*(814+39) + [5]*(316+20)

b = [0]*52632 + [1]*2529 + [0]*14660 + [1]*793 + [0]*3271 + [1]*204 + [0]*814 + [1]*39 + [0]*316 + [1]*20

df = pd.DataFrame([a,b]).T

mutual_info_classif(df.loc[:,0].values.reshape(-1, 1), df.loc[:,1], discrete_features=True)

Expected Results

array([ 1.48233078])

Actual Results

array([ nan])
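
A possible explanation, offered as an assumption and not confirmed anywhere in this report: NumPy 1.13 on Windows uses a 32-bit default integer, so a product of marginal counts from this data could exceed the int32 range, wrap around to a negative number, and produce nan when passed to log. The sketch below reproduces only that arithmetic with explicit dtypes. The variable names are illustrative, and this is not the scikit-learn code itself.

```python
import numpy as np

# Marginal counts taken from the reproduction data above.
count_a1 = 52632 + 2529                        # rows where a == 1
count_b0 = 52632 + 14660 + 3271 + 814 + 316    # rows where b == 0

# Force 32-bit integers to mimic the assumed Windows default integer type.
narrow = np.array([count_a1], dtype=np.int32) * np.array([count_b0], dtype=np.int32)

# Same product in 64-bit integers, which has room for the true value.
wide = np.array([count_a1], dtype=np.int64) * np.array([count_b0], dtype=np.int64)

# Taking the log of a negative number yields nan; the warning is silenced
# here only to keep the demonstration output clean.
with np.errstate(invalid="ignore"):
    log_narrow = np.log(narrow)

print("int32 product:", narrow)
print("int64 product:", wide)
print("log of int32 product:", log_narrow)
```

If this is indeed the cause, casting the count arrays to a 64-bit integer type before multiplying should avoid the nan, which would also explain why the problem appears on Windows only.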

Versions

Windows-7-6.1.7601-SP1
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 12:30:02) [MSC v.1900 64 bit (AMD64)]
NumPy 1.13.1
SciPy 0.19.1
Scikit-Learn 0.19.0
