UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U26'), dtype('int64')) -> None #19

Closed
SSMK-wq opened this issue Feb 21, 2022 · 3 comments

SSMK-wq commented Feb 21, 2022

Thanks a lot for this package. It is very useful to me.

I am trying to follow a tutorial on HackerNoon to select features from a dataset.

When I execute the code below, I get the following error:

from featurewiz import featurewiz
features, train = featurewiz(ord_train_t,y_train, corr_limit=0.7, verbose=2)

UFuncTypeError: ufunc 'add' did not contain a loop with signature
matching types (dtype('<U26'), dtype('int64')) -> None

However, I verified the dtypes of all my training data (ord_train_t) and the target (y_train).

They are all int64 or float64 (as shown below), so I don't understand why the error still occurs. Even after converting float64 to int64, I get the same error. I also ran ord_train_t.isna().sum(), and there are no NAs.

[![enter image description here][5]][5]

The full error is below:

---------------------------------------------------------------------------
UFuncTypeError                            Traceback (most recent call last)
C:\Users\abcde1\AppData\Local\Temp/ipykernel_1888/1114387036.py in <module>
      1 from featurewiz import featurewiz
      2 
----> 3 features, train = featurewiz(ord_train_t,y_train, corr_limit=0.7, verbose=2)

~\Anaconda3\lib\site-packages\featurewiz\featurewiz.py in featurewiz(dataname, target, corr_limit, verbose, sep, header, test_data, feature_engg, category_encoders, dask_xgboost_flag, nrows, **kwargs)
   1027     ##################    L O A D    T E S T   D A T A      ######################
   1028     dataname = remove_duplicate_cols_in_dataset(dataname)
-> 1029     dataname = remove_special_chars_in_names(dataname, target, verbose=1)
   1030     if dask_xgboost_flag:
   1031         train = remove_special_chars_in_names(train, target)

~\Anaconda3\lib\site-packages\featurewiz\featurewiz.py in remove_special_chars_in_names(df, target, verbose)
   3581     else:
   3582         sel_preds = [x for x in list(df) if x not in target]
-> 3583         df = df[sel_preds+target]
   3584     orig_preds = copy.deepcopy(sel_preds)
   3585     #####   column names must not have any special characters #####

~\Anaconda3\lib\site-packages\pandas\core\ops\common.py in new_method(self, other)
     67         other = item_from_zerodim(other)
     68 
---> 69         return method(self, other)
     70 
     71     return new_method

~\Anaconda3\lib\site-packages\pandas\core\arraylike.py in __radd__(self, other)
     94     @unpack_zerodim_and_defer("__radd__")
     95     def __radd__(self, other):
---> 96         return self._arith_method(other, roperator.radd)
     97 
     98     @unpack_zerodim_and_defer("__sub__")

~\Anaconda3\lib\site-packages\pandas\core\series.py in _arith_method(self, other, op)
   5524 
   5525         with np.errstate(all="ignore"):
-> 5526             result = ops.arithmetic_op(lvalues, rvalues, op)
   5527 
   5528         return self._construct_result(result, name=res_name)

~\Anaconda3\lib\site-packages\pandas\core\ops\array_ops.py in arithmetic_op(left, right, op)
    222         _bool_arith_check(op, left, right)
    223 
--> 224         res_values = _na_arithmetic_op(left, right, op)
    225 
    226     return res_values

~\Anaconda3\lib\site-packages\pandas\core\ops\array_ops.py in _na_arithmetic_op(left, right, op, is_cmp)
    164 
    165     try:
--> 166         result = func(left, right)
    167     except TypeError:
    168         if is_object_dtype(left) or is_object_dtype(right) and not is_cmp:

~\Anaconda3\lib\site-packages\pandas\core\computation\expressions.py in evaluate(op, a, b, use_numexpr)
    237         if use_numexpr:
    238             # error: "None" not callable
--> 239             return _evaluate(op, op_str, a, b)  # type: ignore[misc]
    240     return _evaluate_standard(op, op_str, a, b)
    241 

~\Anaconda3\lib\site-packages\pandas\core\computation\expressions.py in _evaluate_standard(op, op_str, a, b)
     67     if _TEST_MODE:
     68         _store_test_result(False)
---> 69     return op(a, b)
     70 
     71 

~\Anaconda3\lib\site-packages\pandas\core\roperator.py in radd(left, right)
      7 
      8 def radd(left, right):
----> 9     return right + left
     10 
     11 

UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U26'), dtype('int64')) -> None

AutoViML commented Feb 21, 2022

You have made an error in your input. Your code should be modified as follows:

from featurewiz import featurewiz
features, trainm = featurewiz(train,target, corr_limit=0.7, verbose=2)

The entire train dataframe, including the target column, should be passed in as-is as the first argument. The target column should then be passed as the second argument.
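For readers who, like the poster, already hold the features and the target as separate objects, here is a minimal sketch of putting them back into one dataframe before the call. The toy values and the column names `age`, `income`, and `label` are assumptions made for illustration, and the sketch assumes the rows of `ord_train_t` and `y_train` are in the same order.

```python
import pandas as pd

# Toy stand-ins for the poster's split data (hypothetical values and names).
ord_train_t = pd.DataFrame({"age": [25, 32, 47], "income": [40.0, 52.5, 61.0]})
y_train = pd.Series([0, 1, 0], name="label")

# Use the Series name as the target column name, with a fallback if unnamed.
target = y_train.name if y_train.name is not None else "target"

# Attach the target values as a new column (assumes identical row order).
train = ord_train_t.copy()
train[target] = y_train.to_numpy()

# `target` is now a plain string that names a column of `train`.
assert isinstance(target, str) and target in train.columns

# The call would then take the shape shown in this thread:
# features, trainm = featurewiz(train, target, corr_limit=0.7, verbose=2)
```

The featurewiz call is left as a comment so the sketch runs without the package installed.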

Please read the documentation carefully,
Ram


SSMK-wq commented Feb 21, 2022

@AutoViML - Yes, I did send the entire dataframe as is, and sent the target column separately as well. Should I change it to 'trainm' instead of 'train'? Is this what you meant?

@AutoViML

Your mistake was sending the target as a dataframe rather than as the name of the column. The target argument should be the column name, as in the code snippet below:

features, trainm = featurewiz(train,target, corr_limit=0.7, verbose=2)
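This likely explains the string dtype in the error even though all of the data is numeric. The traceback fails at `df[sel_preds+target]`, where `sel_preds` is a list of column names. If `target` holds the y values instead of a name, pandas tries to add strings to integers. The sketch below reproduces that mechanism in isolation. The column names are hypothetical, and the final line only illustrates list concatenation. It does not show how featurewiz handles a string target internally.

```python
import pandas as pd

sel_preds = ["age", "income"]        # column names: a list of str (hypothetical)
target_as_data = pd.Series([0, 1])   # the y values themselves, integer dtype
target_as_name = "label"             # a column name, as the maintainer asks for

# list + Series falls through to Series.__radd__, which attempts an
# element-wise add of strings and integers and raises a TypeError subclass.
try:
    sel_preds + target_as_data
    failed = False
except TypeError as exc:
    failed = True
    print(type(exc).__name__)

# With names on both sides, "+" is ordinary list concatenation.
columns = sel_preds + [target_as_name]
print(columns)
```

The exact exception class and message can vary with the installed pandas and NumPy versions, so the sketch only catches `TypeError`.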

I hope you can look at the notebooks in the examples folder linked below and use one of them to build your own pipeline.

https://github.com/AutoViML/featurewiz/tree/main/examples

AutoViML

SSMK-wq closed this as completed Feb 22, 2022