Batch k means #2

Merged
merged 6 commits into from May 17, 2011

Conversation

GaelVaroquaux

A few fixes and enhancements.

import numpy as np
import pylab as pl
Owner

Out of curiosity, why use pylab instead of matplotlib?

Author

It's the same thing (pylab is a helper module shipped by matplotlib). The reason I changed this line is for consistency.
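
A minimal sketch of that point, assuming a matplotlib installation that still ships the top-level `pylab` module: the plotting functions reached through `pylab` are the same objects that `matplotlib.pyplot` exposes. The `Agg` backend and the variable names `same_plot` and `same_figure` are choices made for this illustration only, so that the snippet runs without a display.

```python
import matplotlib

# Assumption for this sketch: use a non-interactive backend so no display is needed.
matplotlib.use("Agg")

import matplotlib.pyplot as plt
import pylab as pl

# pylab re-exports the pyplot namespace, so these names refer to the same function objects.
same_plot = pl.plot is plt.plot
same_figure = pl.figure is plt.figure

print("pl.plot is plt.plot:", same_plot)
print("pl.figure is plt.figure:", same_figure)
```

Either import style therefore calls the same plotting code; the choice between them is a matter of consistency across the code base.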

NelleV merged commit 96d47af into NelleV:batchKMeans May 17, 2011
NelleV pushed a commit that referenced this pull request Oct 7, 2011
NelleV pushed a commit that referenced this pull request Jan 9, 2012
Update random forest example to demo n_jobs=2
NelleV pushed a commit that referenced this pull request Mar 16, 2012
NelleV pushed a commit that referenced this pull request Feb 21, 2017
…scikit-learn#7838)

* initial commit for return_std

* initial commit for return_std

* adding tests, examples, ARD predict_std

* adding tests, examples, ARD predict_std

* a smidge more documentation

* a smidge more documentation

* Missed a few PEP8 issues

* Changing predict_std to return_std #1

* Changing predict_std to return_std #2

* Changing predict_std to return_std #3

* Changing predict_std to return_std final

* adding better plots via polynomial regression

* trying to fix flake error

* fix to ARD plotting issue

* fixing some flakes

* Two blank lines part 1

* Two blank lines part 2

* More newlines!

* Even more newlines

* adding info to the doc string for the two plot files

* Rephrasing "polynomial" for Bayesian Ridge Regression

* Updating "polynomia" for ARD

* Adding more formal references

* Another asked-for improvement to doc string.

* Fixing flake8 errors

* Cleaning up the tests a smidge.

* A few more flakes

* requested fixes from Andy

* Mini bug fix

* Final pep8 fix

* pep8 fix round 2

* Fix beta_ to alpha_ in the comments
NelleV pushed a commit that referenced this pull request Aug 11, 2017
NelleV pushed a commit that referenced this pull request Aug 11, 2017
* resurrect quantile scaler

* move the code in the pre-processing module

* first draft

* Add tests.

* Fix bug in QuantileNormalizer.

* Add quantile_normalizer.

* Implement pickling

* create a specific function for dense transform

* Create a fit function for the dense case

* Create a toy examples

* First draft with sparse matrices

* remove useless functions and non-negative sparse compatibility

* fix slice call

* Fix tests of QuantileNormalizer.

* Fix estimator compatibility

* List of functions became tuple of functions
* Check X consistency at transform and inverse transform time

* fix doc

* Add negative ValueError tests for QuantileNormalizer.

* Fix cosmetics

* Fix compatibility numpy <= 1.8

* Add n_features tests and correct ValueError.

* PEP8

* fix fill_value for early scipy compatibility

* simplify sampling

* Fix tests.

* removing last pring

* Change choice for permutation

* cosmetics

* fix remove remaining choice

* DOC

* Fix inconsistencies

* pep8

* Add checker for init parameters.

* hack bounds and make a test

* FIX/TST bounds are provided by the fitting and not X at transform

* PEP8

* FIX/TST axis should be <= 1

* PEP8

* ENH Add parameter ignore_implicit_zeros

* ENH match output distribution

* ENH clip the data to avoid infinity due to output PDF

* FIX ENH restraint to uniform and norm

* [MRG] ENH Add example comparing the distribution of all scaling preprocessor (#2)

* ENH Add example comparing the distribution of all scaling preprocessor

* Remove Jupyter notebook convert

* FIX/ENH Select feat before not after; Plot interquantile data range for all

* Add heatmap legend

* Remove comment maybe?

* Move doc from robust_scaling to plot_all_scaling; Need to update doc

* Update the doc

* Better aesthetics; Better spacing and plot colormap only at end

* Shameless author re-ordering ;P

* Use env python for she-bang

* TST Validity of output_pdf

* EXA Use OrderedDict; Make it easier to add more transformations

* FIX PEP8 and replace scipy.stats by str in example

* FIX remove useless import

* COSMET change variable names

* FIX change output_pdf occurence to output_distribution

* FIX partial fixies from comments

* COMIT change class name and code structure

* COSMIT change direction to inverse

* FIX factorize transform in _transform_col

* PEP8

* FIX change the magic 10

* FIX add interp1d to fixes

* FIX/TST allow negative entries when ignore_implicit_zeros is True

* FIX use np.interp instead of sp.interpolate.interp1d

* FIX/TST fix tests

* DOC start checking doc

* TST add test to check the behaviour of interp numpy

* TST/EHN Add the possibility to add noise to compute quantile

* FIX factorize quantile computation

* FIX fixes issues

* PEP8

* FIX/DOC correct doc

* TST/DOC improve doc and add random state

* EXA add examples to illustrate the use of smoothing_noise

* FIX/DOC fix some grammar

* DOC fix example

* DOC/EXA make plot titles more succint

* EXA improve explanation

* EXA improve the docstring

* DOC add a bit more documentation

* FIX advance review

* TST add subsampling test

* DOC/TST better example for the docstring

* DOC add ellipsis to docstring

* FIX address olivier comments

* FIX remove random_state in sparse.rand

* FIX spelling doc

* FIX cite example in user guide and docstring

* FIX olivier comments

* EHN improve the example comparing all the pre-processing methods

* FIX/DOC remove title

* FIX change the scaling of the figure

* FIX plotting layout

* FIX ratio w/h

* Reorder and reword the plot_all_scaling example

* Fix aspect ratio and better explanations in the plot_all_scaling.py example

* Fix broken link and remove useless sentence

* FIX fix couples of spelling

* FIX comments joel

* FIX/DOC address documentation comments

* FIX address comments joel

* FIX inline sparse and dense transform

* PEP8

* TST/DOC temporary skipping test

* FIX raise an error if n_quantiles > subsample

* FIX wording in smoothing_noise example

* EXA Denis comments

* FIX rephrasing

* FIX make smoothing_noise to be a boolearn and change doc

* FIX address comments

* FIX verbose the doc slightly more

* PEP8/DOC

* ENH: 2-ways interpolation to avoid smoothing_noise

Simplifies also the code, examples, and documentation
NelleV pushed a commit that referenced this pull request Aug 11, 2017
* add test for _preprocess_data and make it consistent

* fix pep8

* add doc, cast systematically y in X.dtype and update test_coordinate_descent.py

* test if input values don't change with copy=True

* test if input values don't change with copy=True #2

* fix doc

* fix doc #2

* fix doc #3
NelleV pushed a commit that referenced this pull request Aug 20, 2017
NelleV pushed a commit that referenced this pull request Aug 20, 2017
NelleV pushed a commit that referenced this pull request May 28, 2018
initial PR commit

seq_dataset.pyx generated from template

seq_dataset.pyx generated from template #2

rename variables

fused types consistency test for seq_dataset

a

sklearn/utils/tests/test_seq_dataset.py

new if statement

add doc

sklearn/utils/seq_dataset.pyx.tp

minor changes

minor changes

typo fix

check numeric accuracy only up 5th decimal

Address oliver's request for changing test name

add test for make_dataset and rename a variable in test_seq_dataset
NelleV pushed a commit that referenced this pull request Jul 8, 2018
glemaitre pushed a commit that referenced this pull request Jul 17, 2018
…y calculation (scikit-learn#11464)

* Fix to allow M

* Updated MAE test to consider sample_weights in calculation

* Removed comment

* Fixed: E501 line too long (82 > 79 characters)

* syntax correction

* Added fix details

* Changed to use consistent datatypes during calculaions

* Corrected formatting

* Requested Changes

* removed explicit casts

* Removed unnecessary explicits

* Removed unnecessary explicit casts

* added additional test

* updated comments

* Requested changes incl additional unit test

* fix mistake

* formatting

* removed whitespace

* added test notes

* formatting

* Requested changes

* Trailing space fix attempt

* Trailing whitespace fix attempt #2

* Remove trailing whitespace
glemaitre pushed a commit that referenced this pull request Jul 17, 2018
* Add averaging option to AMI and NMI

Leave current behavior unchanged

* Flake8 fixes

* Incorporate tests of means for AMI and NMI

* Add note about `average_method` in NMI

* Update docs from AMI, NMI changes (#1)

* Correct the NMI and AMI descriptions in docs

* Update docstrings due to averaging changes

- V-measure
- Homogeneity
- Completeness
- NMI
- AMI

* Update documentation and remove nose tests (#2)

* Update v0.20.rst

* Update test_supervised.py

* Update clustering.rst

* Fix multiple spaces after operator

* Rename all arguments

* No more arbitrary values!

* Improve handling of floating-point imprecision

* Clearly state when the change occurs

* Update AMI/NMI docs

* Update v0.20.rst

* Catch FutureWarnings in AMI and NMI
NelleV pushed a commit that referenced this pull request Sep 5, 2019
…13243)

* Remove unused code

* Squash all the PR 9040 commits

initial PR commit

seq_dataset.pyx generated from template

seq_dataset.pyx generated from template #2

rename variables

fused types consistency test for seq_dataset

a

sklearn/utils/tests/test_seq_dataset.py

new if statement

add doc

sklearn/utils/seq_dataset.pyx.tp

minor changes

minor changes

typo fix

check numeric accuracy only up 5th decimal

Address oliver's request for changing test name

add test for make_dataset and rename a variable in test_seq_dataset

* FIX tests

* TST more numerically stable test_sgd.test_tol_parameter

* Added benchmarks to compare SAGA 32b and 64b

* Fixing gael's comments

* fix

* solve some issues

* PEP8

* Address lesteve comments

* fix merging

* avoid using assert_equal

* use all_close

* use explicit ArrayDataset64 and CSRDataset64

* fix: remove unused import

* Use parametrized to cover ArrayDaset-CSRDataset-32-64 matrix

* for consistency use 32 first then 64 + add 64 suffix to variables

* it would be cool if this worked !!!

* more verbose version

* revert SGD changes as much as possible.

* Add solvers back to bench_saga

* make 64 explicit in the naming

* remove checking native python type + add comparison between 32 64

* Add whatsnew with everyone with commits

* simplify a bit the testing

* simplify the parametrize

* update whatsnew

* fix pep8
NelleV pushed a commit that referenced this pull request Sep 5, 2019
* initial commit

* used random class

* fixed failing testcases, reverted __init__.py

* fixed failing testcases #2
- passed rng as parameter to ParameterSampler class
- changed seed from 0 to 42 (as original)

* fixed failing testcases #2
- passed rng as parameter to SparseRandomProjection class

* fixed failing testcases #4
- passed rng as parameter to GaussianRandomProjection class

* fixed failing test case because of flake 8