ExternalRiskEstimate seems to be hard coded into HELOC data processing, but I cannot find it. #120

Closed
BrianBlackman opened this issue May 19, 2021 · 5 comments

@BrianBlackman

[Screenshot: Screen Shot 2021-05-19 at 11 51 10 AM]

If I change the name:


ValueError Traceback (most recent call last)
in
2 from aix360.algorithms.rbm import FeatureBinarizer
3 fb = FeatureBinarizer(negations=True, returnOrd=True)
----> 4 dfTrain, dfTrainStd = fb.fit_transform(dfTrain)
5 dfTest, dfTestStd = fb.transform(dfTest)
6 dfTrain['MostRecentBillAmountRaw'].head()

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
697 if y is None:
698 # fit method of arity 1 (unsupervised transformation)
--> 699 return self.fit(X, **fit_params).transform(X)
700 else:
701 # fit method of arity 2 (supervised transformation)

~/PycharmProjects/AIX360/aix360/algorithms/rbm/features.py in fit(self, X)
111 self.ordinal = ordinal
112 # Fit StandardScaler to ordinal features
--> 113 self.scaler = StandardScaler().fit(data[ordinal])
114 return self
115

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/preprocessing/_data.py in fit(self, X, y, sample_weight)
728 # Reset internal state before fitting
729 self._reset()
--> 730 return self.partial_fit(X, y, sample_weight)
731
732 def partial_fit(self, X, y=None, sample_weight=None):

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/preprocessing/_data.py in partial_fit(self, X, y, sample_weight)
766 X = self._validate_data(X, accept_sparse=('csr', 'csc'),
767 estimator=self, dtype=FLOAT_DTYPES,
--> 768 force_all_finite='allow-nan', reset=first_call)
769 n_features = X.shape[1]
770

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
419 out = X
420 elif isinstance(y, str) and y == 'no_validation':
--> 421 X = check_array(X, **check_params)
422 out = X
423 else:

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0

~/opt/anaconda3/envs/aix360/lib/python3.6/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
538
539 if all(isinstance(dtype, np.dtype) for dtype in dtypes_orig):
--> 540 dtype_orig = np.result_type(*dtypes_orig)
541
542 if dtype_numeric:

<array_function internals> in result_type(*args, **kwargs)

ValueError: at least one array or dtype is required
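
The last project frame in the traceback above is `StandardScaler().fit(data[ordinal])` in `features.py`, and the NumPy message indicates that `np.result_type` was called with no dtypes at all. One plausible reading, not confirmed anywhere in this thread, is that `ordinal` was an empty list, meaning no column of the new data was treated as ordinal. The sketch below uses pandas only. The helper `candidate_ordinal_columns`, the rule it applies (numeric dtype with more than two distinct values), the toy values, and the `HasDefault` column are all assumptions for illustration, not the library's actual logic or the reporter's data.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def candidate_ordinal_columns(df: pd.DataFrame) -> list:
    """Hypothetical helper: numeric columns with more than two distinct values."""
    return [
        col for col in df.columns
        if is_numeric_dtype(df[col]) and df[col].nunique() > 2
    ]


# Toy frame (assumed data): the amount column was read as strings,
# so it is not numeric, and the other column has only two distinct values.
df = pd.DataFrame({
    "MostRecentBillAmountRaw": ["100", "250", "75", "310"],
    "HasDefault": [0, 1, 0, 1],
})
print(candidate_ordinal_columns(df))  # no candidates in this toy frame

# Converting the string column to numbers makes it a candidate.
df["MostRecentBillAmountRaw"] = pd.to_numeric(df["MostRecentBillAmountRaw"])
print(candidate_ordinal_columns(df))
```

If a check like this returns an empty list for a real dataset, inspecting `df.dtypes` is a reasonable next step before calling `fit_transform`.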

@dennislwei
Collaborator

Hi @BrianBlackman, you'll have to elaborate. 'ExternalRiskEstimate' is a column in the HELOC dataset. What do you mean by not being able to find it? And what do you mean by changing the name? 'MostRecentBillAmountRaw' is not a column in the dataset.

@BrianBlackman
Author

I am trying to run another data set through the notebook example. I can use most of the original headers of my new data set, but some seem to be fixed. 'ExternalRiskEstimate' is one that I seemingly MUST use, even though it is not the target column, 'RiskPerformance'. I understand the need to specify the target column explicitly. My new data doesn't include the external risk estimate or anything like it. 'MostRecentBillAmountRaw' is a column in the new data.

@dennislwei
Collaborator

I'm still not sure I understand where your problem is. Regarding the line dfTrain['ExternalRiskEstimate'].head(), you can omit it, as it doesn't affect anything downstream. It simply prints the 'ExternalRiskEstimate' column of the DataFrame to show how FeatureBinarizer transformed it.
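
For anyone adapting the notebook to their own data, the preview line can be pointed at a column that actually exists instead of being removed. This is a minimal sketch using pandas only. It assumes the transformed frame's column index carries the original feature name as its first level. `preview_column` is a hypothetical helper, and the toy frame stands in for real FeatureBinarizer output.

```python
import pandas as pd


def preview_column(df: pd.DataFrame, name: str, n: int = 5):
    """Return the first n rows of the column(s) whose top-level label is `name`."""
    if isinstance(df.columns, pd.MultiIndex):
        top_level = df.columns.get_level_values(0)
    else:
        top_level = df.columns
    if name not in top_level:
        available = list(dict.fromkeys(top_level))  # unique labels, original order
        raise KeyError(f"{name!r} not found; available columns: {available}")
    return df[name].head(n)


# Toy stand-in for a binarized frame (assumed layout: feature, operator, threshold).
columns = pd.MultiIndex.from_tuples([
    ("MostRecentBillAmountRaw", "<=", 100),
    ("MostRecentBillAmountRaw", ">", 100),
])
toy = pd.DataFrame([[1, 0], [0, 1]], columns=columns)

print(preview_column(toy, "MostRecentBillAmountRaw"))
```

Asking for a name that is absent, such as 'ExternalRiskEstimate' on this toy frame, raises a `KeyError` that lists the available columns.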

@BrianBlackman
Author

I really appreciate your patience with me. I think this solves the issue I raised. I am sorry I raised it.

@dennislwei
Collaborator

No need to be sorry. Please let us know if you do find a problem with FeatureBinarizer.
