Add float32 support for SGDClassifier and SGDRegressor #5776

elanmart · 2015-11-10T00:33:19Z

Hi,

I was wondering why sklearn does not allow me to specify the dtype I'd like to use.

I'm working with a rather large dataset, which fits into my RAM as float32, but when I try to train a simple SGD model on it, sklearn tries to copy my data into float64, causing a MemoryError.

I can change this in my local sklearn build, but I guess there is a good reason why this is not a free parameter?
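To make the memory argument above concrete, here is a minimal back-of-the-envelope sketch. The dataset shape is hypothetical (the issue does not state one), and the helper `dense_nbytes` is an illustrative name, not part of sklearn. It only shows why a float32 array that fits in RAM can still fail once a float64 copy is requested: the copy is twice as large, and the original stays alive while it is made.

```python
def dense_nbytes(n_samples, n_features, itemsize):
    """Bytes needed for a dense (n_samples, n_features) array."""
    return n_samples * n_features * itemsize


# Hypothetical dataset shape, chosen only for illustration.
n_samples, n_features = 10_000_000, 300

as_float32 = dense_nbytes(n_samples, n_features, 4)  # float32: 4 bytes per value
as_float64 = dense_nbytes(n_samples, n_features, 8)  # float64: 8 bytes per value

# While the float64 copy is being created, the float32 original is still
# alive, so both allocations count toward peak memory.
peak = as_float32 + as_float64

gib = 1024 ** 3
print(f"float32 input: {as_float32 / gib:.1f} GiB")
print(f"float64 copy:  {as_float64 / gib:.1f} GiB")
print(f"peak (both):   {peak / gib:.1f} GiB")
```

Under these assumptions, peak memory is three times the size of the float32 input, which is why preserving the input dtype matters for data that only just fits in RAM.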


amueller · 2015-11-12T18:04:28Z

PR welcome ;)
It would be great to have both 32 and 64 bit everywhere, I think.
This means using fused types everywhere in Cython.

One of the reasons not to do that was the explosion in the generated C code, but I think with #5492 we need to be somewhat less careful about that.

lorentzenchr · 2023-02-09T10:17:40Z

#13243 implemented float32 support for SequentialDataset. This has to be propagated to _plain_sgd in _sgd_fast.pyx (and other places) so that float32 X ndarrays are preserved in SGDClassifier and SGDRegressor.

jjerphan · 2023-02-09T15:24:04Z

@OmarManzoor: you might be interested in this slightly harder Cython issue.

OmarManzoor · 2023-02-09T15:30:15Z

@jjerphan Thank you for the suggestion.

OmarManzoor · 2023-02-10T09:15:48Z

@jjerphan Do we need to change the _sgd_fast.pxd file in relation to this? On checking its usages it is mainly being used inside the _sag_fast module which I think already supports float32 and float64.

OmarManzoor · 2023-02-10T09:57:52Z

@jjerphan Do we need to change the _sgd_fast.pxd file in relation to this? On checking its usages it is mainly being used inside the _sag_fast module which I think already supports float32 and float64.

On further investigation I don't think we need to change this file.

ssaeger mentioned this issue Feb 23, 2016

[MRG+1] Allows KMeans/MiniBatchKMeans to use float32 internally by using cython fused types #6430

Closed

yenchenlin mentioned this issue Mar 14, 2016

[MRG+1] Use fused type in inplace normalize #6539

Merged

yenchenlin mentioned this issue Apr 9, 2016

[MRG+2] Use fused types in sparse mean variance functions #6593

Merged

jnothman mentioned this issue Apr 26, 2016

Add fused type to Cython files #5973

Closed

yenchenlin mentioned this issue Jun 14, 2016

[WIP] Make SGD support Cython fused types #6889

Closed

yenchenlin mentioned this issue Aug 30, 2016

Remove DataConvergenceWarning from KMeans? #7256

Closed

rth mentioned this issue Mar 19, 2018

Incorrect Clusters Due To Dtype Mismatch #10832

Closed

rth mentioned this issue Mar 26, 2019

Support large sparse matrices in SGD* and SequentialDataset #11355

Open

serralba mentioned this issue Nov 12, 2020

MemoryError on large binary datasets #18825

Closed

thomasjpfan added the Performance and Hard labels Feb 27, 2022

lorentzenchr changed the title ~~Working with float32 data~~ Add float32 support for SGDClassifier and SGDRegressor Feb 9, 2023

lorentzenchr added the Enhancement, help wanted, module:linear_model, cython, float32 and Performance labels and removed the Performance and Hard labels Feb 9, 2023

OmarManzoor mentioned this issue Feb 10, 2023

ENH Support float32 in SGDClassifier and SGDRegressor #25587

Merged

thomasjpfan closed this as completed in #25587 Feb 27, 2023
