iteration over a 0-d array in _nanrankdata #16

Closed
nerdcha opened this issue Nov 29, 2016 · 8 comments

@nerdcha

nerdcha commented Nov 29, 2016

I gather that others have hit this (#12), but I'm afraid it still seems to be a live issue. It occurs when X and y are ndarrays of what looks like the right shape.

There's a reproducible example on Iris data here.

@bittremieux

I can confirm that this remains an issue. When this problem was previously reported (#8, #12) the issues appear to have been closed without any real resolution.

In my case at least, the problem is caused by the inability to handle NaN data, as reported in #8. A local workaround is to change the _nanrankdata method back to bottleneck.nanrankdata.
Commit 80a74c1 explicitly removed the dependency on bottleneck, but this isn't really a solution as the new functionality is broken.
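
For illustration, a NaN-tolerant ranking helper could look roughly like the sketch below. It uses only NumPy, averages tied ranks, leaves NaN entries as NaN, and returns early on empty or all-NaN input. The names `nanrank_1d` and `nanrankdata_sketch` are hypothetical; this is neither Boruta's `_nanrankdata` nor bottleneck's implementation, just one way such a replacement might be written.

```python
import numpy as np


def nanrank_1d(values):
    """Rank a 1-D array (1 = smallest), averaging ties and keeping NaNs as NaN.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=float)
    ranks = np.full(values.shape, np.nan)
    valid = ~np.isnan(values)
    if not valid.any():
        # Empty or all-NaN input: there is nothing to rank.
        return ranks
    finite = values[valid]
    # np.unique sorts the distinct values; `inverse` maps each element
    # back to the position of its value among those distinct values.
    _, inverse, counts = np.unique(finite, return_inverse=True, return_counts=True)
    # Highest ordinal rank occupied by each group of tied values.
    last = np.cumsum(counts)
    # A tie group covers ordinal ranks (last - counts + 1) .. last;
    # its average rank is therefore last - (counts - 1) / 2.
    average = last - (counts - 1) / 2.0
    ranks[valid] = average[inverse]
    return ranks


def nanrankdata_sketch(X, axis=0):
    """Apply nanrank_1d along `axis`, returning an empty result for empty input."""
    X = np.asarray(X, dtype=float)
    if X.size == 0:
        # Nothing to rank, e.g. zero columns were passed in.
        return np.empty(X.shape, dtype=float)
    return np.apply_along_axis(nanrank_1d, axis, X)
```

Calling `nanrankdata_sketch(arr, axis=1)` on a 2-D array ranks each row independently; whether that matches the tie-handling Boruta expects is an assumption that would need checking against the library.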

@danielhomola
Collaborator

@bittremieux's PR apparently fixed this. Let me know if the issue persists, but for the time being, I'll close this. Thanks again!

@Proteusiq

Proteusiq commented Feb 21, 2018

Hello, the problem still remains, unless I have missed the solution.

boruta 0.1.5
Python 3.6 running on Docker


`TypeError                                 Traceback (most recent call last)
<ipython-input-124-be126db958b1> in <module>()
      1 # find all relevant features
----> 2 feat_selector.fit(X, y)

/opt/conda/lib/python3.6/site-packages/boruta/boruta_py.py in fit(self, X, y)
    199         """
    200 
--> 201         return self._fit(X, y)
    202 
    203     def transform(self, X, weak=False):

/opt/conda/lib/python3.6/site-packages/boruta/boruta_py.py in _fit(self, X, y)
    333         imp_history_rejected = imp_history[1:, not_selected] * -1
    334         # calculate ranks in each iteration, then median of ranks across feats
--> 335         iter_ranks = self._nanrankdata(imp_history_rejected, axis=1)
    336         rank_medians = np.nanmedian(iter_ranks, axis=0)
    337         ranks = self._nanrankdata(rank_medians, axis=0)

/opt/conda/lib/python3.6/site-packages/boruta/boruta_py.py in _nanrankdata(self, X, axis)
    500         Replaces bottleneck's nanrankdata with scipy and numpy alternative.
    501         """
--> 502         ranks = sp.stats.mstats.rankdata(X, axis=axis)
    503         ranks[np.isnan(X)] = np.nan
    504         return ranks

/opt/conda/lib/python3.6/site-packages/scipy/stats/mstats_basic.py in rankdata(data, axis, use_missing)
    264             return _rank1d(data, use_missing)
    265     else:
--> 266         return ma.apply_along_axis(_rank1d,axis,data,use_missing).view(ndarray)
    267 
    268 

/opt/conda/lib/python3.6/site-packages/numpy/ma/extras.py in apply_along_axis(func1d, axis, arr, *args, **kwargs)
    394     i.put(indlist, ind)
    395     j = i.copy()
--> 396     res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
    397     #  if res is a number, then we have a smaller output array
    398     asscalar = np.isscalar(res)

/opt/conda/lib/python3.6/site-packages/scipy/stats/mstats_basic.py in _rank1d(data, use_missing)
    252 
    253         repeats = find_repeats(data.copy())
--> 254         for r in repeats[0]:
    255             condition = (data == r).filled(False)
    256             rk[condition] = rk[condition].mean()

TypeError: iteration over a 0-d array

`
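
The final frame fails at `for r in repeats[0]`, which suggests `repeats[0]` was a 0-d array at that point (why SciPy returned one here is not established in this thread). The sketch below reproduces the same message with plain NumPy and shows `np.atleast_1d` as one way to make such a value iterable; the variable names are illustrative only.

```python
import numpy as np

# A 0-d array has shape () and cannot be iterated over.
zero_d = np.array(0.0)

message = None
try:
    for _ in zero_d:
        pass
except TypeError as exc:
    # Same error text as in the traceback above.
    message = str(exc)

# np.atleast_1d promotes the 0-d array to shape (1,), which is iterable.
safe_values = [float(v) for v in np.atleast_1d(zero_d)]
```

This only demonstrates the NumPy behaviour behind the message; it does not by itself fix the call inside SciPy.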

@chansinhui

chansinhui commented Apr 24, 2018

Facing the same issue here.

image

@FrancisHChen

FrancisHChen commented Sep 18, 2018

Same issue here:

Traceback (most recent call last):
    boruta_selector.fit(dataX.values, dataY.values.ravel())
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/boruta/boruta_py.py", line 201, in fit
    return self._fit(X, y)
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/boruta/boruta_py.py", line 335, in _fit
    iter_ranks = self._nanrankdata(imp_history_rejected, axis=1)
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/boruta/boruta_py.py", line 502, in _nanrankdata
    ranks = sp.stats.mstats.rankdata(X, axis=axis)
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/scipy/stats/mstats_basic.py", line 265, in rankdata
    return ma.apply_along_axis(_rank1d,axis,data,use_missing).view(ndarray)
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/numpy/ma/extras.py", line 395, in apply_along_axis
    res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
  File "/spare/local/fchen/virtualenv/lib/python3.5/site-packages/scipy/stats/mstats_basic.py", line 253, in _rank1d
    for r in repeats[0]:
TypeError: iteration over a 0-d array

@royalshan

What is the solution to this issue? I am still facing it.

@mejihero

Same issue:
TypeError                                 Traceback (most recent call last)
in ()
      1 feat_selector = BorutaPy(rf, n_estimators = 50, verbose = 2, random_state = 1)
      2
----> 3 feat_selector.fit(X, y)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\boruta\boruta_py.py in fit(self, X, y)
    199         """
    200
--> 201         return self._fit(X, y)
    202
    203     def transform(self, X, weak=False):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\boruta\boruta_py.py in _fit(self, X, y)
    333         imp_history_rejected = imp_history[1:, not_selected] * -1
    334         # calculate ranks in each iteration, then median of ranks across feats
--> 335         iter_ranks = self._nanrankdata(imp_history_rejected, axis=1)
    336         rank_medians = np.nanmedian(iter_ranks, axis=0)
    337         ranks = self._nanrankdata(rank_medians, axis=0)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\boruta\boruta_py.py in _nanrankdata(self, X, axis)
    500         Replaces bottleneck's nanrankdata with scipy and numpy alternative.
    501         """
--> 502         ranks = sp.stats.mstats.rankdata(X, axis=axis)
    503         ranks[np.isnan(X)] = np.nan
    504         return ranks

~\AppData\Local\Continuum\anaconda3\lib\site-packages\scipy\stats\mstats_basic.py in rankdata(data, axis, use_missing)
    263             return _rank1d(data, use_missing)
    264     else:
--> 265         return ma.apply_along_axis(_rank1d,axis,data,use_missing).view(ndarray)
    266
    267

~\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\ma\extras.py in apply_along_axis(func1d, axis, arr, *args, **kwargs)
    392     i.put(indlist, ind)
    393     j = i.copy()
--> 394     res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
    395     # if res is a number, then we have a smaller output array
    396     asscalar = np.isscalar(res)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\scipy\stats\mstats_basic.py in _rank1d(data, use_missing)
    251
    252         repeats = find_repeats(data.copy())
--> 253         for r in repeats[0]:
    254             condition = (data == r).filled(False)
    255             rk[condition] = rk[condition].mean()

TypeError: iteration over a 0-d array

@elnazsn1988

Same error here. Any solutions? It seems to have trouble with different Boruta versions.
