MinMax Scaler and TypeErrors #1973

amaatouq opened this Issue May 19, 2013 · 6 comments



I am trying to use MinMaxScaler from sklearn.

My code is quite simple:
import numpy as np
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
followers = np.array(df['followers_count'].astype('float'))
scaled_followers = scaler.fit(followers)

I get the following error (I tried several NumPy arrays, with the same result):

TypeError Traceback (most recent call last)
in ()
2 followers = np.array(df['followers_count'].astype('float'))
3 print followers
----> 4 scaled_followers = scaler.fit(followers)

/home/amaatouq/anaconda/lib/python2.7/site-packages/sklearn/preprocessing.pyc in fit(self, X, y)
195 scale_ = np.max(X, axis=0) - min_
196 # Do not scale constant features
--> 197 scale_[scale_ == 0.0] = 1.0
198 self.scale_ = (feature_range[1] - feature_range[0]) / scale_
199 self.min_ = feature_range[0] - min_ / scale_

TypeError: 'numpy.float64' object does not support item assignment
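
For readers following the traceback, here is a minimal sketch of what appears to go wrong, using only NumPy. The variable names mirror the lines shown in the traceback, but this is an illustration and not the library's actual source. With a 1-D input, reducing along axis=0 yields a single numpy.float64 instead of an array, and a scalar cannot take a boolean-mask assignment.

```python
import numpy as np

# 1-D input, shape (5,), like the array passed to fit() in the report
X = np.array([1.0, 1.0, 2.0, 3.0, 4.0])

min_ = np.min(X, axis=0)
scale_ = np.max(X, axis=0) - min_

# On a 1-D array the reduction collapses to a zero-dimensional scalar
is_scalar = np.ndim(scale_) == 0

# The masked assignment from the traceback fails on a scalar
try:
    scale_[scale_ == 0.0] = 1.0
    error = None
except TypeError as exc:
    error = exc

# With a 2-D column of shape (n_samples, 1) the reduction keeps one
# entry per feature, so the same assignment is valid
X_col = X.reshape(-1, 1)
scale_col = np.max(X_col, axis=0) - np.min(X_col, axis=0)  # shape (1,)
scale_col[scale_col == 0.0] = 1.0
```

This suggests the input shape, rather than the values in the array, is what triggers the error.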

scikit-learn member

Thanks for the report. That is odd.
What is the shape of followers? And what arguments did you give to MinMaxScaler?
Can you reproduce with random data?


Thanks for your attention. I am using scikit-learn 0.13.1

followers.shape is (1076817,)

The following code should reproduce the error:

from sklearn.preprocessing import MinMaxScaler
import numpy as np
scaler = MinMaxScaler()
followers = np.array([1,1,2,3,4])
scaled_followers = scaler.fit(followers)

Note: I tried this on Canopy (EPD) and Anaconda with the same error message


OK, it is solved if you add brackets around the DataFrame column:

followers = np.array([df.followers_count.astype('float')])

I am new to Python and I find this odd; I thought DataFrame columns were NumPy array-like.

scikit-learn member

The shape doesn't really make sense for scikit-learn data. It should be (n_samples, n_features). I guess n_features is one in your case. I think it should still work, though, or at least give a decent error message.
You can also np.vstack the data to make it (n_samples, 1).
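
A minimal sketch of the (n_samples, 1) layout described above, using the small array from the reproduction script. The use of reshape(-1, 1) is an assumption of this sketch; np.vstack, as suggested, produces the same column shape. Note that wrapping the column in brackets gives shape (1, n_samples) instead, which avoids the error but treats each value as its own feature, so it is probably not the scaling that was intended.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

followers = np.array([1, 1, 2, 3, 4], dtype=float)  # shape (5,)

# One row per sample, a single feature column: shape (5, 1)
X = followers.reshape(-1, 1)

scaler = MinMaxScaler()              # default feature_range is (0, 1)
scaled = scaler.fit_transform(X)     # still shape (5, 1)

# Flatten back to 1-D if a plain vector is more convenient downstream
scaled_followers = scaled.ravel()

# The bracket workaround yields one sample with five features
X_wide = np.array([followers])       # shape (1, 5)
```

After fitting, scaled_followers holds the values mapped linearly onto the range 0 to 1.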


I see, I am using sklearn.preprocessing for my normal data analysis needs (not only machine learning), and I am loving it, so really, thanks for your efforts, amazing library.

I know this is probably not related to scikit-learn, but I'd like to know what you think about this problem.
I'd like to scatter-plot two variables in order to visualise the correlation and a least-squares line fit (for publication purposes). I have 7514022 observations.

My X variable has a mean of 770.7, a standard deviation of 24687.7, a min of 0 and a max of 17587402, and my Y has the same problem of very large variance.

I tried z-scores (dividing by the std) and MinMax scaling, with no luck with the visualisation because of the variance in my dataset. Log scale doesn't work well, as I have a lot of 0 values and log scale doesn't show linear relationships well. I'd like to know what you think.

Thanks a lot
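
One possible reason the scalers did not help: z-scoring and min-max scaling are both linear rescalings, so they change the axis units but not the relative spacing of the points. A minimal sketch with synthetic heavy-tailed data (the lognormal data and all variable names are assumptions for illustration, not the dataset from this thread) shows that the Pearson correlation is identical before and after either rescaling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic heavy-tailed variables standing in for the real data
x = rng.lognormal(mean=0.0, sigma=2.0, size=10_000)
y = 0.5 * x + rng.lognormal(mean=0.0, sigma=2.0, size=10_000)

def min_max(v):
    # Linear map of v onto the range 0 to 1
    return (v - v.min()) / (v.max() - v.min())

def z_score(v):
    # Linear map of v to zero mean and unit standard deviation
    return (v - v.mean()) / v.std()

r_raw = np.corrcoef(x, y)[0, 1]
r_minmax = np.corrcoef(min_max(x), min_max(y))[0, 1]
r_z = np.corrcoef(z_score(x), z_score(y))[0, 1]
```

Since all three correlations agree, a scatter plot of the rescaled data would look the same as the raw one apart from its tick labels, and a few extreme values would still dominate the axes.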

scikit-learn member


amueller closed this Jul 18, 2014