[Error] get_disparity_predefined_group() raises AttributeError #86

Open
LiFaytheGoblin opened this issue Jul 9, 2020 · 3 comments

@LiFaytheGoblin

I am trying to run the following code:

bdf = b.get_disparity_predefined_groups(xtab, original_df=df, 
                                        ref_groups_dict={'race':'Caucasian'}, 
                                        alpha=0.05, check_significance=True, 
                                        mask_significance=False)
bdf.style

but it raises an AttributeError with the following traceback:

get_disparity_predefined_group()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-22-8a5ae26f1e35> in <module>
      2                                         ref_groups_dict={'race':'Caucasian'},
      3                                         alpha=0.05, check_significance=True,
----> 4                                         mask_significance=False)
      5 bdf.style

C:\Program_Files\Anaconda3\lib\site-packages\aequitas\bias.py in get_disparity_predefined_groups(self, df, original_df, ref_groups_dict, key_columns, input_group_metrics, fill_divbyzero, check_significance, alpha, mask_significance, selected_significance)
    439             self._get_statistical_significance(
    440                 original_df, df, ref_dict=full_ref_dict, score_thresholds=None,
--> 441                 attr_cols=None, alpha=5e-2, selected_significance=selected_significance)
    442 
    443             # if specified, apply T/F mask to significance columns

C:\Program_Files\Anaconda3\lib\site-packages\aequitas\bias.py in _get_statistical_significance(cls, original_df, disparity_df, ref_dict, score_thresholds, attr_cols, alpha, selected_significance)
    745                 for name, func in binary_col_functions.items():
    746                     func = func(thres_unit, 'label_value', thres_val)
--> 747                     original_df.loc[:, name] = original_df.apply(func, axis=1)
    748 
    749         # add columns for error-based significance

C:\Program_Files\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, result_type, args, **kwds)
   6485                          args=args,
   6486                          kwds=kwds)
-> 6487         return op.get_result()
   6488 
   6489     def applymap(self, func):

C:\Program_Files\Anaconda3\lib\site-packages\pandas\core\apply.py in get_result(self)
    149             return self.apply_raw()
    150 
--> 151         return self.apply_standard()
    152 
    153     def apply_empty_result(self):

C:\Program_Files\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_standard(self)
    255 
    256         # compute the result using the series generator
--> 257         self.apply_series_generator()
    258 
    259         # wrap results

C:\Program_Files\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)
    284             try:
    285                 for i, v in enumerate(series_gen):
--> 286                     results[i] = self.f(v)
    287                     keys.append(v.name)
    288             except Exception as e:

C:\Program_Files\Anaconda3\lib\site-packages\aequitas\bias.py in <lambda>(x)
    734 
    735         binary_score = lambda rank_col, label_col, thres: lambda x: (
--> 736                 x[rank_col] <= thres).astype(int)
    737 
    738         binary_col_functions = {'binary_score': binary_score,

AttributeError: ("'bool' object has no attribute 'astype'", 'occurred at index 0')

It works if I set check_significance=False.

My data frame:

entity_id        int64
race            object
score          float64
label_value    float64
rank_abs         int32
rank_pct       float64
dtype: object

Any idea why this happens? I have the up-to-date Aequitas version this time.
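
The message `'bool' object has no attribute 'astype'` suggests that the comparison at `bias.py` line 736 produced a plain Python `bool` inside the row-wise `apply`, rather than a NumPy boolean. One plausible cause (an assumption, not confirmed in this thread) is that rows of a frame with mixed dtypes reach the lambda as object-dtype Series, so `x[rank_col]` is a plain Python number. The sketch below uses a hypothetical frame and a hypothetical threshold `thres` to show the difference and two ways of building the binary column that do not depend on the scalar type. It is an illustration only, not necessarily how Aequitas itself handles this.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the reporter's frame: object and numeric columns mixed.
df = pd.DataFrame({
    "race": ["Caucasian", "African-American", "Caucasian"],
    "rank_abs": [1, 2, 3],
    "label_value": [1.0, 0.0, 1.0],
})
thres = 2  # hypothetical threshold, chosen only for illustration

# A plain Python bool has no .astype, which matches the message in the traceback;
# a NumPy boolean scalar does have it.
python_bool_has_astype = hasattr(True, "astype")            # False
numpy_bool_has_astype = hasattr(np.bool_(True), "astype")   # True

# Scalar-safe row-wise variant: int() accepts both kinds of boolean.
row_wise = df.apply(lambda x: int(x["rank_abs"] <= thres), axis=1)

# Vectorized variant: compare the whole column, then cast the boolean Series.
vectorized = (df["rank_abs"] <= thres).astype(int)
```

Both variants give the same 0/1 column here; the vectorized form also avoids a Python-level call per row.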

@kalikhademi

I have the same problem.
My data is as follows:
score 0/1
label value 0/1
sex object
race object
age_category object

@AndreFCruz
Collaborator

AndreFCruz commented Mar 19, 2021

Hi

Can you provide a minimal working example?

The following code runs fine for me on the latest aequitas:

import random
import numpy as np
import pandas as pd

n_samples = 1000

df = pd.DataFrame({
    'label_value': (np.random.random((n_samples,)) > 0.95).astype(int),
    'score': (np.random.random((n_samples,)) > 0.90).astype(int),
    'gender': np.array(['M' if random.random() > 0.5 else 'F' for _ in range(n_samples)]),
    'race': np.array(['Caucasian' if random.random() > 0.2 else 'Non-Caucasian' for _ in range(n_samples)]),
    'age_category': np.array([int(random.random() * 4) for _ in range(n_samples)]).astype(str),
})
df.dtypes

from aequitas.group import Group
from aequitas.bias import Bias

attr_cols = list(set(df.columns) - {
    'entity_id', 'score', 'label_value', 'as_of_date'
})

# Initialize aequitas objects
g = Group()
b = Bias()

# Get confusion matrix and metrics for each individual group and attribute
confusion_matrix_metrics, _ = g.get_crosstabs(
    df, attr_cols=attr_cols,
)


bdf = b.get_disparity_predefined_groups(
    confusion_matrix_metrics, original_df=df, 
    ref_groups_dict={
        'race': 'Caucasian',
        'gender': 'M',
        'age_category': '1',
    }, 
    alpha=0.05, check_significance=True, 
    mask_significance=False,
)
bdf.style

@camyaheltonthomas

AttributeError Traceback (most recent call last)
/opt/anaconda3/envs/test/lib/python3.9/site-packages/altair/vegalite/v4/api.py in ?(self, include, exclude)
1647 # see ipython/ipython#11038
1648 try:
1649 dct = self.to_dict()
1650 except Exception:
-> 1651 utils.display_traceback(in_ipython=True)
1652 return {}
1653 else:
1654 return renderers.get()(dct)

/opt/anaconda3/envs/test/lib/python3.9/site-packages/altair/vegalite/v4/api.py in ?(self, *args, **kwargs)
371
372 try:
373 dct = super(TopLevelMixin, copy).to_dict(*args, **kwargs)
374 except jsonschema.ValidationError:
--> 375 dct = None
376
377 # If we hit an error, then re-convert with validate='deep' to get
378 # a more useful traceback. We don't do this by default because it's

/opt/anaconda3/envs/test/lib/python3.9/site-packages/altair/utils/schemapi.py in ?(self, validate, ignore, context)
321
322 if self._args and not self._kwds:
...
6297 ):
6298 return self[name]
-> 6299 return object.__getattribute__(self, name)

AttributeError: 'Series' object has no attribute 'iteritems'
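
This last traceback has a different origin from the original report: `Series.iteritems` was deprecated in pandas 1.5 and removed in pandas 2.0, so code that still calls it (here, the Altair v4 frames shown in the traceback) fails on newer pandas. A minimal sketch of the replacement call, `Series.items()`, follows; the Series is hypothetical. Whether upgrading Altair or pinning pandas below 2.0 is the better route for this environment is not established in the thread and would need checking against the installed versions.

```python
import pandas as pd

# Hypothetical Series, used only to illustrate the call.
s = pd.Series([0.1, 0.9], index=["a", "b"])

# Series.items() yields (index, value) pairs and exists in old and new pandas alike.
pairs = list(s.items())

# Series.iteritems() is gone in pandas >= 2.0, so guard before relying on it.
has_iteritems = hasattr(s, "iteritems")
print(pd.__version__, "iteritems available:", has_iteritems)
```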
