[MRG+2] Fixed n**2 memory blowup in _labels_inertia_precompute_dense #7721

Merged (4 commits, Oct 27, 2016)

Conversation

Erotemic (Contributor) commented Oct 21, 2016

What does this implement/fix? Explain your changes.

In k-means and minibatch k-means, the _labels_inertia_precompute_dense function finds the nearest cluster center for each test data point by computing the full distance matrix from every test point to every cluster center. When both the number of cluster centers and the number of test data points are large, this leads to a huge memory blowup. I ran out of memory quite fast on my 64GB machine when computing a 65K visual vocabulary from over one million 128-dimensional SIFT descriptors, even when using a modest batch size.
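
To make the blowup concrete (my arithmetic, not from the PR; assuming a float64 distance matrix), the full matrix needs n_samples * n_clusters * 8 bytes:

    # Rough size of the full distance matrix for the workload described above,
    # assuming float64 (8-byte) entries.
    n_samples = 1000000   # ~1M SIFT descriptors
    n_clusters = 65536    # ~65K visual words
    full_matrix_gb = n_samples * n_clusters * 8 / 1e9
    print(full_matrix_gb)  # ~524 GB -- far beyond a 64GB machine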

Initially I was going to write a helper function that would batch this operation into a few chunks to avoid having the entire distance matrix represented in memory, but it turns out the function already exists in the sklearn codebase. Yay for reusable code!

I removed the old explicit euclidean_distances call followed by argmin and plugged in pairwise_distances_argmin_min, which batches the computation internally. This change dramatically reduces the amount of memory used in the computation.
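
As a rough illustration (a minimal, self-contained sketch with placeholder sizes, not the actual diff), the old and new label-assignment paths look roughly like this:

    import numpy as np
    from sklearn.metrics import pairwise_distances_argmin_min
    from sklearn.metrics.pairwise import euclidean_distances

    rng = np.random.RandomState(0)
    X = rng.rand(10000, 128)       # points being labeled (placeholder sizes)
    centers = rng.rand(1000, 128)  # current cluster centers

    # Old path: materializes the full (n_samples, n_clusters) distance matrix.
    dist = euclidean_distances(X, centers, squared=True)
    labels_old = dist.argmin(axis=1)
    mindist_old = dist[np.arange(X.shape[0]), labels_old]

    # New path: pairwise_distances_argmin_min chunks X internally, so only a
    # small slice of the distance matrix exists in memory at any time.
    labels, mindist = pairwise_distances_argmin_min(
        X, centers, metric='euclidean', metric_kwargs={'squared': True})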

Any other comments?

Initially I was going to wait to submit this until #7383 was merged, but it's an important enough change that I wanted to make sure it was in the pipeline. I do want to note that if/when #7383 is merged, check_inputs should be set to False when calling pairwise_distances_argmin_min.

@amueller

I think it would be great if you could do some benchmarks showing this doesn't impact performance. Otherwise it looks good :)

sklearn/cluster/k_means_.py

    mindist = np.minimum(dist, mindist)
    # Breakup nearest neighbor distance computation into batches to prevent
    # memory blowup

amueller (Member) commented Oct 24, 2016

Maybe add "in the case of many clusters"?

Erotemic (Contributor) commented Oct 24, 2016

Fixed. I said "in the case of a large number of samples and clusters" because the issue is primarily due to assigning labels to every point in a large dataset.

sklearn/cluster/k_means_.py

    # Breakup nearest neighbor distance computation into batches to prevent
    # memory blowup
    metric_kwargs = dict(squared=True)
    # Should use check_inputs=False when speedup_kmpp is merged

amueller (Member) commented Oct 24, 2016

I would rather not mention the other PR here; if you want to reference it, please use the PR number. There's no way for someone else to look up the issue based on your branch name (as far as I know).

Erotemic (Contributor) commented Oct 24, 2016

I changed it to a TODO. This code is called every iteration in the mini-batch main loop, and for the same reasons as #7383 it would be preferable not to repeat NaN checks every time, so I want to make sure this change is not forgotten.

sklearn/cluster/k_means_.py

    # Should use check_inputs=False when speedup_kmpp is merged
    # metric_kwargs = dict(squared=True, check_inputs=False)
    labels, mindist = pairwise_distances_argmin_min(
        X=X, Y=centers, metric='euclidean', metric_kwargs=metric_kwargs)

amueller (Member) commented Oct 24, 2016

Do we want to use a larger batch size? Is there any impact on the overall runtime from using this (also, say, for a low number of clusters)?

Erotemic (Contributor) commented Oct 24, 2016

The default batch size for pairwise_distances_argmin_min is 500. On average, pairwise_distances_argmin_min seems to be faster than the original implementation. When I add batch_size to the varied parameters, 500 performs best.
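
For reference, a minimal sketch of passing an explicit batch size (assuming the 0.18-era signature, where batch_size is a keyword of pairwise_distances_argmin_min; it was later deprecated in favor of the global working_memory setting):

    import numpy as np
    from sklearn.metrics import pairwise_distances_argmin_min

    rng = np.random.RandomState(0)
    X = rng.rand(10000, 32)       # placeholder data
    centers = rng.rand(1000, 32)  # placeholder cluster centers

    # batch_size controls how many rows of X are processed per chunk, which
    # bounds the temporary distance block at batch_size * n_clusters entries.
    labels, mindist = pairwise_distances_argmin_min(
        X, centers, metric='euclidean',
        metric_kwargs={'squared': True}, batch_size=500)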

Erotemic (Contributor) commented Oct 24, 2016

Ignore this post. These results are in error.

As requested, here are some benchmarks.

Overall it seems that this change improves speed. I'm testing the changed code in a context independent of KMeans/MiniBatchKMeans, running each function 10 times and averaging the durations.

My first test sweeps over this parameter basis:

    basis = {
        'n_clusters': [10, 100, 1000, 10000][::-1],
        'n_features': [16, 32, 128][::-1],
        'n_samples': [10, 100, 100000][::-1],
    }

Here is my benchmark script:
https://gist.github.com/Erotemic/e1c8f3d8ada70194e9b4b51f23999ff3

And here is its output.

    n_clusters  n_features  n_samples     MB_new      MB_old  new_speed  old_speed  percent_change  absolute_change
8        10000          16      10000  19.073486  381.469727   0.665101   1.393896       52.284779         0.728796
20        1000          16      10000  19.073486  381.469727   0.675535   1.397402       51.657791         0.721867
32         100          16      10000  19.073486  381.469727   0.685428   1.395160       50.871008         0.709732
44          10          16      10000  19.073486  381.469727   0.758500   1.415223       46.404166         0.656722
28         100          32      10000  19.073486  381.469727   0.685661   1.083227       36.701977         0.397566
4        10000          32      10000  19.073486  381.469727   0.707204   1.098049       35.594480         0.390845
16        1000          32      10000  19.073486  381.469727   0.709260   1.083840       34.560426         0.374580
40          10          32      10000  19.073486  381.469727   0.732543   1.102451       33.553172         0.369907
12        1000         128      10000  19.073486  381.469727   0.966271   1.045146        7.546729         0.078874
24         100         128      10000  19.073486  381.469727   0.966225   1.044865        7.526309         0.078640
0        10000         128      10000  19.073486  381.469727   1.002711   1.049651        4.471936         0.046940
33         100          16       1000   1.907349    3.814697   0.006638   0.020983       68.364786         0.014345
9        10000          16       1000   1.907349    3.814697   0.006622   0.020797       68.158327         0.014175
45          10          16       1000   1.907349    3.814697   0.006999   0.020693       66.179575         0.013695
21        1000          16       1000   1.907349    3.814697   0.006622   0.019960       66.824933         0.013338
36          10         128      10000  19.073486  381.469727   1.035961   1.047755        1.125669         0.011794
17        1000          32       1000   1.907349    3.814697   0.007022   0.018497       62.036160         0.011475
29         100          32       1000   1.907349    3.814697   0.007031   0.016663       57.803568         0.009632
41          10          32       1000   1.907349    3.814697   0.007018   0.016322       57.002813         0.009304
25         100         128       1000   1.907349    3.814697   0.009587   0.018799       49.004558         0.009212
5        10000          32       1000   1.907349    3.814697   0.007026   0.016197       56.619233         0.009170
13        1000         128       1000   1.907349    3.814697   0.009920   0.016950       41.475522         0.007030
1        10000         128       1000   1.907349    3.814697   0.010162   0.017016       40.280847         0.006854
37          10         128       1000   1.907349    3.814697   0.011530   0.016796       31.349553         0.005265
10       10000          16        100   0.038147    0.038147   0.000272   0.000711       61.712542         0.000439
34         100          16        100   0.038147    0.038147   0.000270   0.000705       61.766495         0.000436
22        1000          16        100   0.038147    0.038147   0.000270   0.000705       61.771270         0.000436
46          10          16        100   0.038147    0.038147   0.000270   0.000705       61.640498         0.000434
30         100          32        100   0.038147    0.038147   0.000267   0.000665       59.876012         0.000398
18        1000          32        100   0.038147    0.038147   0.000269   0.000657       59.041782         0.000388
42          10          32        100   0.038147    0.038147   0.000269   0.000657       59.031732         0.000388
6        10000          32        100   0.038147    0.038147   0.000272   0.000657       58.595905         0.000385
14        1000         128        100   0.038147    0.038147   0.000330   0.000683       51.705240         0.000353
38          10         128        100   0.038147    0.038147   0.000546   0.000689       20.791976         0.000143
26         100         128        100   0.038147    0.038147   0.000554   0.000687       19.441457         0.000134
47          10          16         10   0.000381    0.000381   0.000180   0.000214       15.961239         0.000034
11       10000          16         10   0.000381    0.000381   0.000164   0.000181        9.305702         0.000017
35         100          16         10   0.000381    0.000381   0.000162   0.000178        8.921883         0.000016
19        1000          32         10   0.000381    0.000381   0.000163   0.000176        7.405398         0.000013
43          10          32         10   0.000381    0.000381   0.000163   0.000175        7.169093         0.000013
27         100         128         10   0.000381    0.000381   0.000170   0.000182        6.505236         0.000012
23        1000          16         10   0.000381    0.000381   0.000163   0.000174        6.718664         0.000012
39          10         128         10   0.000381    0.000381   0.000170   0.000180        5.639396         0.000010
3        10000         128         10   0.000381    0.000381   0.000170   0.000180        5.509934         0.000010
15        1000         128         10   0.000381    0.000381   0.000170   0.000179        5.286929         0.000009
31         100          32         10   0.000381    0.000381   0.000175   0.000178        1.660863         0.000003
7        10000          32         10   0.000381    0.000381   0.000195   0.000184       -5.963006        -0.000011
2        10000         128        100   0.038147    0.038147   0.000793   0.000683      -16.045375        -0.000110

I also ran another configuration to show the difference for a small number of clusters.
In these cases I averaged times over a much larger number of runs (1000).

The parameter grid basis is:

    basis = {
        'n_clusters': [2, 3, 5, 7, 10][::-1],
        'n_features': [16, 32, 128][::-1],
        'n_samples': [10, 20, 100][::-1],
    }

The results are:

    n_clusters  n_features  n_samples    MB_new    MB_old  new_speed  old_speed  percent_change  absolute_change
30           3          32        100  0.038147  0.038147   0.000275   0.000807       65.899960     5.317609e-04
21           5          32        100  0.038147  0.038147   0.000296   0.000730       59.458865     4.343090e-04
12           7          32        100  0.038147  0.038147   0.000334   0.000755       55.745552     4.207265e-04
36           2         128        100  0.038147  0.038147   0.000479   0.000885       45.830872     4.055369e-04
3           10          32        100  0.038147  0.038147   0.000273   0.000660       58.680926     3.871379e-04
0           10         128        100  0.038147  0.038147   0.000325   0.000688       52.799051     3.631756e-04
39           2          32        100  0.038147  0.038147   0.000399   0.000743       46.318942     3.439658e-04
9            7         128        100  0.038147  0.038147   0.000434   0.000775       43.946440     3.404644e-04
27           3         128        100  0.038147  0.038147   0.000401   0.000677       40.843261     2.765901e-04
15           7          16        100  0.038147  0.038147   0.000178   0.000421       57.671133     2.430553e-04
6           10          16        100  0.038147  0.038147   0.000178   0.000418       57.516469     2.405097e-04
42           2          16        100  0.038147  0.038147   0.000172   0.000412       58.167538     2.394874e-04
33           3          16        100  0.038147  0.038147   0.000172   0.000409       57.840113     2.366464e-04
24           5          16        100  0.038147  0.038147   0.000171   0.000399       57.207460     2.284994e-04
18           5         128        100  0.038147  0.038147   0.000561   0.000759       26.094387     1.980219e-04
13           7          32         20  0.001526  0.001526   0.000107   0.000162       34.144358     5.537009e-05
40           2          32         20  0.001526  0.001526   0.000109   0.000163       32.772404     5.333662e-05
28           3         128         20  0.001526  0.001526   0.000120   0.000172       30.185137     5.201125e-05
22           5          32         20  0.001526  0.001526   0.000108   0.000160       32.561389     5.194640e-05
4           10          32         20  0.001526  0.001526   0.000108   0.000159       32.462215     5.177689e-05
31           3          32         20  0.001526  0.001526   0.000108   0.000158       31.782196     5.026770e-05
10           7         128         20  0.001526  0.001526   0.000118   0.000168       29.859353     5.008483e-05
1           10         128         20  0.001526  0.001526   0.000118   0.000168       29.547085     4.958415e-05
19           5         128         20  0.001526  0.001526   0.000117   0.000166       29.721602     4.946113e-05
37           2         128         20  0.001526  0.001526   0.000126   0.000169       25.168957     4.248643e-05
34           3          16         20  0.001526  0.001526   0.000109   0.000139       21.917839     3.047681e-05
16           7          16         20  0.001526  0.001526   0.000108   0.000135       20.052362     2.709889e-05
7           10          16         20  0.001526  0.001526   0.000110   0.000137       19.434291     2.657032e-05
25           5          16         20  0.001526  0.001526   0.000105   0.000130       19.348354     2.519703e-05
43           2          16         20  0.001526  0.001526   0.000109   0.000133       17.884533     2.384090e-05
38           2         128         10  0.000381  0.000381   0.000109   0.000118        7.440819     8.784533e-06
44           2          16         10  0.000381  0.000381   0.000102   0.000110        6.649526     7.282257e-06
8           10          16         10  0.000381  0.000381   0.000105   0.000110        5.142866     5.676508e-06
14           7          32         10  0.000381  0.000381   0.000102   0.000108        5.010569     5.403042e-06
26           5          16         10  0.000381  0.000381   0.000101   0.000106        4.595667     4.884005e-06
32           3          32         10  0.000381  0.000381   0.000102   0.000106        3.530339     3.749371e-06
11           7         128         10  0.000381  0.000381   0.000108   0.000111        3.253873     3.619194e-06
5           10          32         10  0.000381  0.000381   0.000105   0.000109        3.034985     3.301859e-06
20           5         128         10  0.000381  0.000381   0.000108   0.000109        1.406731     1.538277e-06
23           5          32         10  0.000381  0.000381   0.000108   0.000110        1.394023     1.528502e-06
17           7          16         10  0.000381  0.000381   0.000104   0.000105        1.283200     1.351833e-06
35           3          16         10  0.000381  0.000381   0.000107   0.000108        0.954456     1.029730e-06
2           10         128         10  0.000381  0.000381   0.000109   0.000110        0.889927     9.801388e-07
41           2          32         10  0.000381  0.000381   0.000110   0.000110       -0.009319    -1.025200e-08
29           3         128         10  0.000381  0.000381   0.000112   0.000110       -1.425901    -1.571417e-06

Again, in most cases the speed is improved. In the cases where it is not, the difference is not significant.

Erotemic (Contributor) commented Oct 24, 2016

Ignore this post. These results are in error.

Here is also a set of results when adding batch_size (for the new code only) to the varied parameters:

    batch_size  n_clusters  n_features  n_samples     MB_new     MB_old  new_speed  old_speed  percent_change  absolute_change
6          500        1000          32       5000   9.536743  95.367432   0.173258   0.297725       41.806148         0.124467
9          500         100          32       5000   9.536743  95.367432   0.171141   0.290231       41.032762         0.119090
12        1000        1000          32       5000  19.073486  95.367432   0.194437   0.295558       34.213565         0.101121
15        1000         100          32       5000  19.073486  95.367432   0.214538   0.309310       30.639769         0.094772
0          250        1000          32       5000   4.768372  95.367432   0.208617   0.294489       29.159458         0.085871
3          250         100          32       5000   4.768372  95.367432   0.206849   0.291109       28.944449         0.084260
16        1000         100          32       1000   3.814697   3.814697   0.007912   0.017920       55.847721         0.010008
13        1000        1000          32       1000   3.814697   3.814697   0.007709   0.016936       54.481849         0.009227
7          500        1000          32       1000   1.907349   3.814697   0.008295   0.017331       52.137348         0.009036
10         500         100          32       1000   1.907349   3.814697   0.008088   0.016484       50.931234         0.008395
4          250         100          32       1000   0.953674   3.814697   0.009932   0.017049       41.742531         0.007117
1          250        1000          32       1000   0.953674   3.814697   0.010551   0.017178       38.577734         0.006627
14        1000        1000          32        100   0.038147   0.038147   0.000276   0.000731       62.287951         0.000455
8          500        1000          32        100   0.038147   0.038147   0.000272   0.000664       59.014236         0.000392
5          250         100          32        100   0.038147   0.038147   0.000271   0.000657       58.792937         0.000386
11         500         100          32        100   0.038147   0.038147   0.000275   0.000659       58.303206         0.000384
2          250        1000          32        100   0.038147   0.038147   0.000274   0.000655       58.256547         0.000382
17        1000         100          32        100   0.038147   0.038147   0.002593   0.000712     -264.059362        -0.001881

amueller (Member) commented Oct 24, 2016

Thanks for the extensive benchmarks.
However, shouldn't it be n_clusters here, instead of n_samples?

Erotemic (Contributor) commented Oct 24, 2016

Yup, that's a mistake. Thanks for catching that.
I've gone ahead and updated the script. The new version can be found here: https://gist.github.com/Erotemic/5f9c173ccbdb154d9f49baabba88d80b

This version runs all benchmarks end to end.

(venv2) joncrall@hyrule:~/code/scikit-learn$ python benchmark_pairwise_distances_argmin_min.py 
Running small clusters benchmark
Prog   45/45...  rate=32.48 Hz, etr=0:00:04, ellapsed=0:00:17, wall=18:11 EST
====
Results for small clusters benchmark
    n_clusters  n_features  n_samples  niters    MB_new    MB_old  new_speed  old_speed  percent_change  absolute_change
14          10          16         10     100  0.000381  0.000381   0.000100   0.000120       16.718211     2.010584e-05
3           10         128         20     100  0.000763  0.000763   0.000105   0.000123       14.051677     1.721859e-05
9           10          32         10     100  0.000381  0.000381   0.000101   0.000115       11.911104     1.364708e-05
13          10          16         20     100  0.000763  0.000763   0.000099   0.000112       11.166940     1.250505e-05
8           10          32         20     100  0.000763  0.000763   0.000100   0.000110        8.978161     9.889603e-06
12          10          16        100     100  0.003815  0.003815   0.000116   0.000125        7.159296     8.974075e-06
4           10         128         10     100  0.000381  0.000381   0.000110   0.000116        4.846603     5.619526e-06
7           10          32        100     100  0.003815  0.003815   0.000118   0.000121        2.161500     2.608299e-06
24           5          32         10     100  0.000191  0.000191   0.000098   0.000098        0.106783     1.049042e-07
22           5          32        100     100  0.001907  0.001907   0.000110   0.000107       -2.132617    -2.291203e-06
29           5          16         10     100  0.000191  0.000191   0.000099   0.000097       -2.473385    -2.398491e-06
11          10          16       1000     100  0.019073  0.038147   0.000266   0.000263       -1.070453    -2.820492e-06
2           10         128        100     100  0.003815  0.003815   0.000135   0.000132       -2.170733    -2.865791e-06
23           5          32         20     100  0.000381  0.000381   0.000100   0.000096       -5.029423    -4.808903e-06
27           5          16        100     100  0.001907  0.001907   0.000107   0.000101       -5.885962    -5.946159e-06
18           5         128         20     100  0.000381  0.000381   0.000111   0.000102       -8.595192    -8.771420e-06
28           5          16         20     100  0.000381  0.000381   0.000105   0.000096       -9.463918    -9.095669e-06
19           5         128         10     100  0.000191  0.000191   0.000104   0.000095      -10.272626    -9.720325e-06
43           2          16         20     100  0.000153  0.000153   0.000097   0.000086      -13.890824    -1.188517e-05
17           5         128        100     100  0.001907  0.001907   0.000125   0.000113      -10.608850    -1.195192e-05
44           2          16         10     100  0.000076  0.000076   0.000098   0.000085      -14.706129    -1.253366e-05
39           2          32         10     100  0.000076  0.000076   0.000097   0.000084      -16.221602    -1.358509e-05
34           2         128         10     100  0.000076  0.000076   0.000100   0.000087      -15.868074    -1.373053e-05
33           2         128         20     100  0.000153  0.000153   0.000102   0.000087      -17.130931    -1.497030e-05
38           2          32         20     100  0.000153  0.000153   0.000101   0.000084      -19.295368    -1.626968e-05
37           2          32        100     100  0.000763  0.000763   0.000107   0.000091      -18.315568    -1.659155e-05
42           2          16        100     100  0.000763  0.000763   0.000105   0.000088      -19.278678    -1.693726e-05
32           2         128        100     100  0.000763  0.000763   0.000121   0.000100      -20.940161    -2.093315e-05
6           10          32       1000     100  0.019073  0.038147   0.000285   0.000235      -21.397744    -5.023718e-05
26           5          16       1000     100  0.009537  0.019073   0.000234   0.000179      -30.246807    -5.425930e-05
21           5          32       1000     100  0.009537  0.019073   0.000248   0.000177      -40.577754    -7.166862e-05
41           2          16       1000     100  0.003815  0.007629   0.000204   0.000126      -61.895679    -7.814169e-05
36           2          32       1000     100  0.003815  0.007629   0.000226   0.000138      -64.220898    -8.841753e-05
16           5         128       1000     100  0.009537  0.019073   0.000400   0.000274      -45.989207    -1.259732e-04
1           10         128       1000     100  0.019073  0.038147   0.000476   0.000340      -40.007286    -1.361465e-04
31           2         128       1000     100  0.003815  0.007629   0.000369   0.000230      -60.007661    -1.381993e-04
10          10          16      50000     100  0.019073  1.907349   0.009195   0.006062      -51.677877    -3.132861e-03
25           5          16      50000     100  0.009537  0.953674   0.007834   0.003422     -128.910291    -4.411876e-03
40           2          16      50000     100  0.003815  0.381470   0.007010   0.001896     -269.715111    -5.114126e-03
5           10          32      50000     100  0.019073  1.907349   0.011009   0.005662      -94.422493    -5.346384e-03
20           5          32      50000     100  0.009537  0.953674   0.009300   0.003646     -155.083368    -5.654335e-03
35           2          32      50000     100  0.003815  0.381470   0.008362   0.002427     -244.574619    -5.935280e-03
15           5         128      50000     100  0.009537  0.953674   0.017994   0.010256      -75.454489    -7.738459e-03
30           2         128      50000     100  0.003815  0.381470   0.016904   0.008649      -95.432006    -8.254313e-03
0           10         128      50000     100  0.019073  1.907349   0.022081   0.012743      -73.280746    -9.338295e-03
Running large clusters test benchmark
Prog   45/45...  rate=140.64 Hz, etr=0:00:06, ellapsed=0:00:18, wall=18:11 EST
====
Results for large clusters test benchmark
    n_clusters  n_features  n_samples  niters    MB_new      MB_old  new_speed  old_speed  percent_change  absolute_change
10        1000          16      50000       5  1.907349  190.734863   0.261915   0.608303       56.943297         0.346388
5         1000          32      50000       5  1.907349  190.734863   0.274456   0.418578       34.431453         0.144123
11        1000          16      10000       5  1.907349   38.146973   0.050142   0.139836       64.142359         0.089694
6         1000          32      10000       5  1.907349   38.146973   0.071054   0.116036       38.765320         0.044982
1         1000         128      10000       5  1.907349   38.146973   0.070532   0.111245       36.597295         0.040713
20         100          32      50000       5  0.190735   19.073486   0.041903   0.069284       39.519362         0.027380
12        1000          16       1000       5  1.907349    3.814697   0.005146   0.021394       75.945742         0.016248
0         1000         128      50000       5  1.907349  190.734863   0.360974   0.375381        3.837760         0.014406
8         1000          32        100       5  0.381470    0.381470   0.000934   0.013102       92.871570         0.012168
7         1000          32       1000       5  1.907349    3.814697   0.011747   0.022400       47.559727         0.010653
2         1000         128       1000       5  1.907349    3.814697   0.007147   0.016800       57.462322         0.009654
13        1000          16        100       5  0.381470    0.381470   0.000896   0.007979       88.769961         0.007083
14        1000          16         10       5  0.038147    0.038147   0.000336   0.006888       95.125480         0.006552
3         1000         128        100       5  0.381470    0.381470   0.001075   0.007183       85.028179         0.006108
9         1000          32         10       5  0.038147    0.038147   0.000398   0.006494       93.872891         0.006097
4         1000         128         10       5  0.038147    0.038147   0.000616   0.006557       90.605662         0.005941
26         100          16      10000       5  0.190735    3.814697   0.016804   0.019864       15.401991         0.003059
27         100          16       1000       5  0.190735    0.381470   0.000886   0.003427       74.132773         0.002540
22         100          32       1000       5  0.190735    0.381470   0.002837   0.003858       26.462881         0.001021
17         100         128       1000       5  0.190735    0.381470   0.001317   0.002135       38.325106         0.000818
28         100          16        100       5  0.038147    0.038147   0.000257   0.000919       72.093144         0.000663
29         100          16         10       5  0.003815    0.003815   0.000172   0.000755       77.200960         0.000583
24         100          32         10       5  0.003815    0.003815   0.000185   0.000759       75.586146         0.000573
19         100         128         10       5  0.003815    0.003815   0.000205   0.000778       73.683888         0.000573
18         100         128        100       5  0.038147    0.038147   0.000573   0.000973       41.174742         0.000401
38          10          32        100       5  0.003815    0.003815   0.000149   0.000230       35.273405         0.000081
23         100          32        100       5  0.038147    0.038147   0.001285   0.001346        4.534344         0.000061
43          10          16        100       5  0.003815    0.003815   0.000119   0.000138       13.751299         0.000019
44          10          16         10       5  0.000381    0.000381   0.000105   0.000121       13.165156         0.000016
34          10         128         10       5  0.000381    0.000381   0.000110   0.000121        8.807267         0.000011
39          10          32         10       5  0.000381    0.000381   0.000107   0.000117        8.421913         0.000010
33          10         128        100       5  0.003815    0.003815   0.000142   0.000144        1.456471         0.000002
37          10          32       1000       5  0.019073    0.038147   0.000306   0.000260      -17.411205        -0.000045
42          10          16       1000       5  0.019073    0.038147   0.000328   0.000272      -20.646067        -0.000056
32          10         128       1000       5  0.019073    0.038147   0.000581   0.000403      -44.101655        -0.000178
41          10          16      10000       5  0.019073    0.381470   0.002119   0.001480      -43.168202        -0.000639
31          10         128      10000       5  0.019073    0.381470   0.004748   0.004102      -15.749212        -0.000646
16         100         128      10000       5  0.190735    3.814697   0.012713   0.012016       -5.808282        -0.000698
36          10          32      10000       5  0.019073    0.381470   0.003367   0.001894      -77.735545        -0.001473
40          10          16      50000       5  0.019073    1.907349   0.010503   0.008665      -21.214106        -0.001838
25         100          16      50000       5  0.190735   19.073486   0.101533   0.098199       -3.395093        -0.003334
15         100         128      50000       5  0.190735   19.073486   0.062969   0.057410       -9.682521        -0.005559
35          10          32      50000       5  0.019073    1.907349   0.012732   0.006141     -107.318607        -0.006591
30          10         128      50000       5  0.019073    1.907349   0.022822   0.013388      -70.469942        -0.009434
21         100          32      10000       5  0.190735    3.814697   0.035442   0.015652     -126.439576        -0.019790
Running batch_size test benchmark
Prog   12/12...  rate=15.50 Hz, etr=0:00:01, ellapsed=0:00:03, wall=18:12 EST
====
Results for batch_size test benchmark
    batch_size  n_clusters  n_features  n_samples  niters    MB_new     MB_old  new_speed  old_speed  percent_change  absolute_change
8         1000        1000          32      10000       5  3.814697  38.146973   0.055106   0.113864       51.603790         0.058758
4          500        1000          32      10000       5  1.907349  38.146973   0.062720   0.117796       46.755318         0.055076
5          500        1000          32       1000       5  1.907349   3.814697   0.005912   0.020623       71.331690         0.014710
9         1000        1000          32       1000       5  3.814697   3.814697   0.004928   0.017516       71.864927         0.012588
10        1000        1000          32        100       5  0.381470   0.381470   0.000695   0.007067       90.170450         0.006372
6          500        1000          32        100       5  0.381470   0.381470   0.000809   0.007024       88.477883         0.006215
11        1000        1000          32         10       5  0.038147   0.038147   0.000310   0.006187       94.989904         0.005877
7          500        1000          32         10       5  0.038147   0.038147   0.000378   0.006154       93.857991         0.005776
3          100        1000          32         10       5  0.038147   0.038147   0.000863   0.006173       86.022726         0.005310
2          100        1000          32        100       5  0.381470   0.381470   0.001918   0.007026       72.706234         0.005108
1          100        1000          32       1000       5  0.381470   3.814697   0.014100   0.017281       18.406675         0.003181
0          100        1000          32      10000       5  0.381470  38.146973   0.138070   0.114969      -20.094008        -0.023102

The fixed tests show that:

  • speed does take a hit when the number of clusters is small and the number of samples is large,
  • speed improves when both n_clusters and n_samples are large, and
  • a batch size of 1000 vs. 500 is a toss-up, but both are better than 100.
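
A note on reading the tables above (my interpretation of the columns, not something stated in the benchmark output): percent_change and absolute_change compare old_speed against new_speed, so positive values mean the new code is faster, and the MB columns track the largest distance block each implementation has to hold at once.

    # Sketch of how the comparison columns appear to be derived (not the gist itself).
    def summarize(old_speed, new_speed):
        percent_change = 100.0 * (old_speed - new_speed) / old_speed  # > 0: new code is faster
        absolute_change = old_speed - new_speed                       # seconds saved per call
        return percent_change, absolute_change

    # Rough peak size (in elements) of the distance block each approach materializes;
    # multiply by the dtype itemsize to get bytes.
    def peak_block_elems(n_samples, n_clusters, batch_size=None):
        rows = n_samples if batch_size is None else min(batch_size, n_samples)
        return rows * n_clusters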
16         100         128      10000       5  0.190735    3.814697   0.012713   0.012016       -5.808282        -0.000698
36          10          32      10000       5  0.019073    0.381470   0.003367   0.001894      -77.735545        -0.001473
40          10          16      50000       5  0.019073    1.907349   0.010503   0.008665      -21.214106        -0.001838
25         100          16      50000       5  0.190735   19.073486   0.101533   0.098199       -3.395093        -0.003334
15         100         128      50000       5  0.190735   19.073486   0.062969   0.057410       -9.682521        -0.005559
35          10          32      50000       5  0.019073    1.907349   0.012732   0.006141     -107.318607        -0.006591
30          10         128      50000       5  0.019073    1.907349   0.022822   0.013388      -70.469942        -0.009434
21         100          32      10000       5  0.190735    3.814697   0.035442   0.015652     -126.439576        -0.019790
Running batch_size test benchmark
Prog   12/12...  rate=15.50 Hz, etr=0:00:01, ellapsed=0:00:03, wall=18:12 EST
====
Results for batch_size test benchmark
    batch_size  n_clusters  n_features  n_samples  niters    MB_new     MB_old  new_speed  old_speed  percent_change  absolute_change
8         1000        1000          32      10000       5  3.814697  38.146973   0.055106   0.113864       51.603790         0.058758
4          500        1000          32      10000       5  1.907349  38.146973   0.062720   0.117796       46.755318         0.055076
5          500        1000          32       1000       5  1.907349   3.814697   0.005912   0.020623       71.331690         0.014710
9         1000        1000          32       1000       5  3.814697   3.814697   0.004928   0.017516       71.864927         0.012588
10        1000        1000          32        100       5  0.381470   0.381470   0.000695   0.007067       90.170450         0.006372
6          500        1000          32        100       5  0.381470   0.381470   0.000809   0.007024       88.477883         0.006215
11        1000        1000          32         10       5  0.038147   0.038147   0.000310   0.006187       94.989904         0.005877
7          500        1000          32         10       5  0.038147   0.038147   0.000378   0.006154       93.857991         0.005776
3          100        1000          32         10       5  0.038147   0.038147   0.000863   0.006173       86.022726         0.005310
2          100        1000          32        100       5  0.381470   0.381470   0.001918   0.007026       72.706234         0.005108
1          100        1000          32       1000       5  0.381470   3.814697   0.014100   0.017281       18.406675         0.003181
0          100        1000          32      10000       5  0.381470  38.146973   0.138070   0.114969      -20.094008        -0.023102

The fixed benchmarks show that:

  • Speed does take a hit when the number of clusters is small and the number of samples is large.
  • Speed improves when both n_clusters and n_samples are large.
  • A batch size of 1000 vs. 500 is a toss-up, but both are better than 100.
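As a rough sanity check on the MB_new / MB_old columns: they are consistent with the size of one block of pairwise distances, assuming float32 entries (an assumption on my part, not read from the script). For example, for the n_clusters=1000, n_samples=50000 row of the large clusters benchmark with the default batch size of 500:

    def dist_block_mb(n_rows, n_clusters, itemsize=4):
        # Memory for one (n_rows, n_clusters) block of pairwise distances, in MiB.
        return n_rows * n_clusters * itemsize / 2.0 ** 20

    # Old code: one block covering every sample at once.
    print(dist_block_mb(50000, 1000))  # ~190.73 MiB, matches MB_old
    # New code: one block per batch of 500 samples.
    print(dist_block_mb(500, 1000))    # ~1.91 MiB, matches MB_new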
@amueller

Member

amueller commented Oct 24, 2016

Ok, that is a bit more the outcome I was expecting. Can you check if it has any impact in the context of k-means? I would expect it doesn't.
I'm +1 on the change, and I think we should leave in the 500 to not complicate things.

@Erotemic

Contributor

Erotemic commented Oct 25, 2016

Here are the results for using MiniBatchKMeans

    n_clusters  n_features  n_samples  niters  new_speed  old_speed  percent_change  absolute_change
0         1000          32      10000      10   4.980729   7.192707       30.753068         2.211978
1         1000          32       1000      10   1.608892   1.755806        8.367337         0.146914
2          100          32      10000      10   0.895685   0.978522        8.465502         0.082837
3          100          32       1000      10   0.221525   0.276816       19.974065         0.055291
5           10          32      10000      10   0.073181   0.085164       14.070968         0.011983
9            5          32      10000      10   0.045946   0.057554       20.168327         0.011608
10           5          32       1000      10   0.018519   0.024064       23.043763         0.005545
11           5          32        100      10   0.009777   0.013042       25.037841         0.003266
7           10          32        100      10   0.015932   0.018242       12.661593         0.002310
15           2          32        100      10   0.006658   0.008254       19.342111         0.001596
6           10          32       1000      10   0.031605   0.032837        3.753794         0.001233
16           2          32         10      10   0.005564   0.005948        6.459487         0.000384
4          100          32        100      10   0.185290   0.185462        0.092469         0.000171
12           5          32         10      10   0.009373   0.008518      -10.036275        -0.000855
8           10          32         10      10   0.014827   0.013651       -8.616179        -0.001176
14           2          32       1000      10   0.012516   0.009446      -32.500164        -0.003070
13           2          32      10000      10   0.049772   0.032537      -52.973129        -0.017236

And here is for regular KMeans

    n_clusters  n_features  n_samples  niters  new_speed  old_speed  percent_change  absolute_change
5            5          32       1000      10   0.157053   0.161075        2.497087         0.004022
9            2          32        100      10   0.015005   0.017995       16.616497         0.002990
6            5          32        100      10   0.020179   0.021756        7.250752         0.001577
3           10          32        100      10   0.028281   0.029752        4.943457         0.001471
8            2          32       1000      10   0.107289   0.108030        0.686143         0.000741
10           2          32         10      10   0.007063   0.007394        4.476654         0.000331
7            5          32         10      10   0.012334   0.012548        1.705101         0.000214
4           10          32         10      10   0.021283   0.020065       -6.068077        -0.001218
2           10          32       1000      10   0.178523   0.173668       -2.795264        -0.004854
0          100          32       1000      10   0.600281   0.594048       -1.049186        -0.006233
1          100          32        100      10   0.272203   0.206414      -31.872523        -0.065789

The script I used to generate these is here:
https://gist.github.com/Erotemic/b476854955ca3c3ee892e2f6212cf93e

The change seems to be minimal for KMeans except for the one case of 100 clusters and 100 samples. Running the script again, it seems this timing was an outlier, perhaps due to a hiccup on my computer. Running that single test again I get

   n_clusters  n_features  n_samples  niters  new_speed  old_speed  percent_change  absolute_change
0         100          32        100      10   0.203867   0.213837        4.662239          0.00997

This makes sense, because this function is not used heavily inside the KMeans algorithm.
However, in MiniBatchKMeans the change does have an effect, and it is consistent with the earlier benchmarks. A small number of clusters combined with a large number of points to label is the case where this change has a significant negative impact. However, once the number of clusters grows even a little, the benefit of batching kicks in.

If this is a problem, the underlying implementation of pairwise_distances_argmin_min could be changed to skip batching when batch_size=None; _labels_inertia_precompute_dense could then pass batch_size=None whenever the number of clusters is less than 5. However, this seems like more work than it's worth: a large number of clusters with a large number of data points is where the efficiency is really needed.
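To make that concrete, here is a minimal sketch of the fallback described above (hypothetical, not what this patch does; it assumes pairwise_distances_argmin_min would be changed to treat batch_size=None as "do not batch", and _assign_labels is just an illustrative name):

    from sklearn.metrics import pairwise_distances_argmin_min

    def _assign_labels(X, centers):
        # Hypothetical: skip batching when there are very few clusters,
        # where per-batch overhead dominates the distance computation.
        batch_size = None if centers.shape[0] < 5 else 500
        return pairwise_distances_argmin_min(
            X=X, Y=centers, metric='euclidean',
            metric_kwargs={'squared': True},
            batch_size=batch_size)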

@amueller

Member

amueller commented Oct 26, 2016

I think your current patch makes a reasonable tradeoff.👍

@amueller amueller changed the title from Fixed n**2 memory blowup in _labels_inertia_precompute_dense to [MRG+ 1] Fixed n**2 memory blowup in _labels_inertia_precompute_dense Oct 26, 2016

@amueller amueller added this to the 0.19 milestone Oct 26, 2016

@jnothman

Otherwise LGTM

sklearn/cluster/k_means_.py
# TODO: Once the functionality (PR #7383) is merged use check_inputs=False.
# metric_kwargs = dict(squared=True, check_inputs=False)
labels, mindist = pairwise_distances_argmin_min(
    X=X, Y=centers, metric='euclidean', metric_kwargs=metric_kwargs)

@jnothman

jnothman Oct 26, 2016

Member

Please inline metric_kwargs as

       X=X, Y=centers, metric='euclidean', metric_kwargs={'squared': True})
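
i.e. the full call would then presumably read:

    labels, mindist = pairwise_distances_argmin_min(
        X=X, Y=centers, metric='euclidean', metric_kwargs={'squared': True})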
@jnothman jnothman changed the title from [MRG+ 1] Fixed n**2 memory blowup in _labels_inertia_precompute_dense to [MRG+2] Fixed n**2 memory blowup in _labels_inertia_precompute_dense Oct 27, 2016

@jnothman

Member

jnothman commented Oct 27, 2016

LGTM. Please add a what's new / enhancements entry.

@jnothman jnothman merged commit 061803c into scikit-learn:master Oct 27, 2016

3 checks passed

ci/circleci: Your tests passed on CircleCI!
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed
@jnothman

Member

jnothman commented Oct 27, 2016

Thanks!

espg added a commit to espg/scikit-learn that referenced this pull request Oct 28, 2016

amueller added a commit to amueller/scikit-learn that referenced this pull request Nov 9, 2016

sergeyf added a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017

afiodorov added a commit to unravelin/scikit-learn that referenced this pull request Apr 25, 2017

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
