KPrototypes unhashable type: 'slice' #67

Closed
paulaceccon opened this issue Apr 3, 2018 · 8 comments

@paulaceccon

As far as I understand, this method accepts a list containing the positions of the categorical variables:

categorical = r1.select_dtypes(include=[object]).columns
range_cat = [r1.columns.get_loc(c) for c in categorical]
range_cat
[0, 10, 11, 13, 14, 15]

But this doesn't work:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-43-8fe6545f1e6d> in <module>()
      3 for k in range(1, 11):
      4     kp = KPrototypes(n_clusters=k, init='Huang', verbose=2)
----> 5     kp.fit(r1, categorical=range_cat)
      6     wcss.append(kp.cost_)
      7 

~\AppData\Local\Continuum\Anaconda3\lib\site-packages\kmodes\kprototypes.py in fit(self, X, y, categorical)
    413                                                     self.init,
    414                                                     self.n_init,
--> 415                                                     self.verbose)
    416         return self
    417 

~\AppData\Local\Continuum\Anaconda3\lib\site-packages\kmodes\kprototypes.py in k_prototypes(X, categorical, n_clusters, max_iter, num_dissim, cat_dissim, gamma, init, n_init, verbose)
    152     assert n_clusters <= npoints, "More clusters than data points?"
    153 
--> 154     Xnum, Xcat = _split_num_cat(X, categorical)
    155     Xnum, Xcat = check_array(Xnum), check_array(Xcat, dtype=None)
    156 

~\AppData\Local\Continuum\Anaconda3\lib\site-packages\kmodes\kprototypes.py in _split_num_cat(X, categorical)
     44     :param categorical: Indices of categorical columns
     45     """
---> 46     Xnum = np.asanyarray(X[:, [ii for ii in range(X.shape[1])
     47                                if ii not in categorical]]).astype(np.float64)
     48     Xcat = np.asanyarray(X[:, categorical])

~\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2137             return self._getitem_multilevel(key)
   2138         else:
-> 2139             return self._getitem_column(key)
   2140 
   2141     def _getitem_column(self, key):

~\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   2144         # get column
   2145         if self.columns.is_unique:
-> 2146             return self._get_item_cache(key)
   2147 
   2148         # duplicate columns & possible reduce dimensionality

~\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   1838         """Return the cached item, item represents a label indexer."""
   1839         cache = self._item_cache
-> 1840         res = cache.get(item)
   1841         if res is None:
   1842             values = self._data.get(item)

TypeError: unhashable type: 'slice'
@katiyarsahil

Were you able to solve this issue? I am facing the same error.

@nicodv
Owner

nicodv commented Sep 14, 2018

Instead of giving kmodes a pandas DataFrame, try giving it a numpy array using df.values. Let me know if that helps.
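
A minimal sketch of that workaround, using a made-up four-column frame in place of the reporter's r1 (the column names and values are illustrative assumptions). It converts the frame with to_numpy(), the newer spelling of df.values, and checks that the tuple-style indexing seen in _split_num_cat in the traceback works on the resulting array:

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the reporter's `r1` (made-up data).
# The string columns are created with dtype=object explicitly so that
# select_dtypes(include=[object]) picks them up.
r1 = pd.DataFrame({
    "city": pd.Series(["a", "b", "a", "c"], dtype=object),
    "income": [1.0, 2.5, 3.0, 4.2],
    "segment": pd.Series(["x", "y", "x", "y"], dtype=object),
    "age": [23, 35, 41, 29],
})

# Positions of the object-dtype (categorical) columns, as plain Python ints.
cat_cols = r1.select_dtypes(include=[object]).columns
cat_idx = [int(r1.columns.get_loc(c)) for c in cat_cols]

# Hand the estimator a NumPy array rather than the DataFrame itself.
X = r1.to_numpy()

# The same style of indexing that failed on the DataFrame works on an ndarray.
num_idx = [i for i in range(X.shape[1]) if i not in cat_idx]
Xnum = X[:, num_idx].astype(np.float64)
Xcat = X[:, cat_idx]

print(cat_idx, Xnum.shape, Xcat.shape)
```

With a real dataset, X and cat_idx would then be passed as kp.fit(X, categorical=cat_idx), mirroring the call in the original report.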

@nicodv nicodv reopened this Sep 14, 2018
@katiyarsahil

That worked. I tried it after reading your comment on 'TypeError: unhashable type #40', and it solved the issue. Thanks a lot.

@nicodv
Owner

nicodv commented Sep 14, 2018

@katiyarsahil, can you do me a favor and run the following on your original pandas object?

str(df.__class__)

And tell us what the output of that is?

It seems like this is still a bug, as the pandas object is not converted to a numpy array.

@katiyarsahil

katiyarsahil commented Sep 14, 2018 via email

@katiyarsahil

"<class 'pandas.core.frame.DataFrame'>". This is the output I get. I had to convert this DataFrame to an array to successfully run the fit_predict command.

@nicodv
Owner

nicodv commented Oct 1, 2018

What version of kmodes are you using, @katiyarsahil?

I'm puzzled why this line, introduced in v0.8, does not solve this issue: https://github.com/nicodv/kmodes/blob/0.8/kmodes/kprototypes.py#L134

@nicodv
Owner

nicodv commented Jan 10, 2019

The problem was not with the data, but with the categorical argument. pandas' get_loc operation, for example, can return a slice rather than an integer position (it does so when a column label is duplicated), whereas kprototypes expects a list of indices.
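
To illustrate, here is a small sketch of how pandas' Index.get_loc behaves for unique and duplicated labels. The index contents are made up for the example:

```python
import pandas as pd

unique_cols = pd.Index(["a", "b", "c"])
dup_cols = pd.Index(["a", "a", "b"])  # "a" appears twice, in sorted order

loc_unique = unique_cols.get_loc("b")  # a single integer position
loc_dup = dup_cols.get_loc("a")        # a slice covering both "a" columns

print(type(loc_unique).__name__, loc_unique)
print(type(loc_dup).__name__, loc_dup)
```

A list built from such results can therefore mix integers and slices, which is not what a list of column positions is meant to hold.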

I now check the data type of the categorical argument and leave it up to users to comply:

if categorical is not None:
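
The snippet above is cut off in this thread, so the actual check is not shown. As a rough sketch only, validation of this kind might look like the following; check_categorical and its n_columns parameter are hypothetical names, not part of kmodes:

```python
from numbers import Integral


def check_categorical(categorical, n_columns):
    """Hypothetical helper: return column positions as a list of ints.

    Accepts a single integer or a list/tuple of integers and rejects
    anything else (for example a slice or a boolean mask).
    """
    # Allow a single position as a convenience (bool is excluded on purpose).
    if isinstance(categorical, Integral) and not isinstance(categorical, bool):
        categorical = [categorical]

    if not isinstance(categorical, (list, tuple)):
        raise TypeError(
            "categorical must be an int or a list/tuple of ints, "
            f"got {type(categorical).__name__}"
        )

    for idx in categorical:
        if isinstance(idx, bool) or not isinstance(idx, Integral):
            raise TypeError(f"column position {idx!r} is not an integer")
        if not 0 <= idx < n_columns:
            raise ValueError(f"column position {idx} is out of range")

    return [int(idx) for idx in categorical]


print(check_categorical([0, 10, 11, 13, 14, 15], n_columns=16))
```

Failing early with a clear message like this is one way to surface a bad categorical argument before it reaches the array indexing shown in the traceback.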
