mapclassify.Natural_Break() does not return the specified k classes #16

Open
mingchau opened this Issue Oct 25, 2018 · 1 comment

@mingchau

commented Oct 25, 2018

Hi,

I use mapclassify.Natural_Breaks() to produce bins for my MapBox heatmap.

My code looks like this:
import pandas as pd
import mapclassify
df = pd.read_table('./files_output/customer_qty.txt', sep=',', header=None)
mapclassify.Natural_Breaks(df.iloc[:, 1], k=5)

I expected it to return 5 classes, but it returned only 3.
The output is:

               Natural_Breaks

Lower               Upper               Count
=============================================
          x[i] <=   1.000               54428
  1.000 < x[i] <=  26.000                2475
 26.000 < x[i] <= 212.000                  66

The customer_qty.txt data file is attached.
customer_qty.txt

@weikang9009

Member

commented Oct 25, 2018

Thank you for opening the issue.

I think the problem lies in the kmeans function from scipy, which mapclassify.Natural_Breaks uses to cluster the input data. It is related to two questions (1 and 2) asked on Stack Overflow. The point is that k-means can fail in the sense that a cluster disappears when no data points are assigned to its center during the iterations. A smarter choice of initial cluster centers therefore matters, and one such scheme is implemented in sklearn (init='k-means++'). I think it makes sense to switch from scipy to sklearn so that the number of classes returned matches the number requested. @sjsrey @ljwolf, what do you think?
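
To make the suggestion concrete, here is a minimal sketch of what classifying 1-D values with sklearn's k-means++ initialization could look like. It is not mapclassify's implementation: the helper name kmeans_breaks, the seed, and the synthetic sample are all assumptions made up for illustration, and the sample is not the attached customer_qty.txt data.

```python
import numpy as np
from sklearn.cluster import KMeans


def kmeans_breaks(values, k=5, seed=0):
    """Hypothetical helper: upper bound of each of k classes from k-means."""
    y = np.asarray(values, dtype=float)
    # k classes cannot exist if the data has fewer than k distinct values.
    if np.unique(y).size < k:
        raise ValueError("need at least k distinct values to form k classes")
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=seed)
    labels = model.fit_predict(y.reshape(-1, 1))
    # The upper bound of a class is the largest value assigned to it.
    return np.sort([y[labels == c].max() for c in np.unique(labels)])


# Synthetic, heavily skewed sample (an assumption, not the reporter's data).
rng = np.random.default_rng(0)
sample = np.concatenate([np.ones(5000), rng.integers(2, 213, size=300)])
bins = kmeans_breaks(sample, k=5)
print(len(bins), bins)
```

The explicit check on the number of distinct values is there because no initialization scheme can produce k non-empty classes from fewer than k distinct values, so that case is better reported than silently returning fewer bins.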

sjsrey added a commit to sjsrey/mapclassify that referenced this issue Oct 28, 2018

sjsrey added a commit to sjsrey/mapclassify that referenced this issue Oct 29, 2018

sjsrey added a commit to sjsrey/mapclassify that referenced this issue Oct 29, 2018

sjsrey added a commit to sjsrey/mapclassify that referenced this issue Oct 29, 2018
