
Fixes for CHOP #14 (Draft)

@GeoffNN wants to merge 7 commits into main.
Conversation

@GeoffNN (Contributor) commented Mar 26, 2021:

  • Added data normalization with StandardScaler
  • Modified the loss function to exactly match the loss function in objective.py (see the sketch after this comment)

This doesn't fix the issue mentioned here.

This still needs work for the stochastic case.
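
For context on the loss-function point (an editorial sketch, not the PR's actual diff): in benchopt, a solver's convergence curve is only meaningful if the solver minimizes exactly the objective defined in objective.py. Assuming a plain logistic-regression objective with labels in {-1, +1} (the benchmark may use a different label encoding or an extra regularization term), matching it in PyTorch could look like:

    import numpy as np
    import torch

    # objective.py style (NumPy): mean logistic loss, labels in {-1, +1}
    def objective_value(X, y, beta):
        return np.log1p(np.exp(-y * (X @ beta))).mean()

    # solver-side loss (PyTorch), written term for term to match the above
    def solver_loss(X, y, beta):
        return torch.log1p(torch.exp(-y * (X @ beta))).mean()

Any discrepancy between the two (a missing 1/n factor, a different label encoding, a dropped regularizer) shifts the reported objective values for this solver relative to the others.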

@GeoffNN marked this pull request as draft on March 26, 2021 at 10:09.
solvers/chop.py (outdated):

    self.X = torch.tensor(X).to(device)
    self.y = torch.tensor(y > 0, dtype=torch.float64).to(device)
    scaler = StandardScaler()
    X = scaler.fit_transform(X)
Contributor:

This should not be done in the solver but in the dataset, as it changes the problem.

Contributor (author):

Can it be an option in the dataset?

Contributor:

The libsvmdata package does some preprocessing like this. I would just offer the right dataset (normalized or not).

Contributor (author), @GeoffNN, Mar 28, 2021:

Right, it seems this was the cause of the infinity values (since the scaler wasn't applied at eval time). Modifying.
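
To illustrate the failure mode (my reading of the bug, as a minimal sketch rather than code from the benchmark): if the solver trains on standardized features but the objective evaluates on the raw ones, the learned weights produce huge margins on the raw data, and a numerically naive logistic loss overflows to infinity:

    import numpy as np

    def naive_logloss(z):
        # numerically naive logistic loss: log(1 + exp(-z))
        return np.log(1 + np.exp(-z))

    print(naive_logloss(-1500.0))  # exp(1500) overflows -> inf (RuntimeWarning)
    print(naive_logloss(-1.5))     # moderate margin, finite loss: ~1.70

Standardizing in the dataset instead keeps the solver and objective.py looking at the same X, which removes the mismatch.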

Contributor:

@GeoffNN don't change the solver but the dataset. Otherwise the solvers will not be comparable.

Contributor (author), @GeoffNN, Mar 28, 2021:

Done; I added standardization as a parameter in Covtype, so it runs on both versions of the dataset. We can also just keep the standardized version if you prefer.

@agramfort (Contributor) left a comment:

Cool! Do you have any new figure you can share?

Thanks a lot @GeoffNN.

Comment on lines +16 to +17:

    requirements = ['pip:https://github.com/openopt/chop/archive/master.zip',
                    'pip:scikit-learn']

Contributor:

Suggested change (drop the scikit-learn requirement):

    requirements = ['pip:https://github.com/openopt/chop/archive/master.zip']


    if self.standardized:
        scaler = StandardScaler()
        X = scaler.fit_transform(X)
Contributor:

I would simplify and always standardize.

Contributor (author):

Ok -- only for Covtype? It would make sense to do it for all non-sparse datasets, right?
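
For reference, a minimal sketch of what the parametrized dataset could look like (benchopt's parameters mechanism is real, but the dataset id, the fetch_libsvm call, and the get_data return format are assumptions, and the return format has changed across benchopt versions):

    from benchopt import BaseDataset
    from libsvmdata import fetch_libsvm
    from sklearn.preprocessing import StandardScaler

    class Dataset(BaseDataset):
        name = "covtype"

        # benchopt instantiates one dataset per listed value, so the
        # benchmark runs on both the raw and the standardized variant
        parameters = {'standardized': [True, False]}

        def get_data(self):
            X, y = fetch_libsvm('covtype.binary')  # assumed dataset id
            if self.standardized:
                # Fit once on the full design matrix so every solver and
                # objective.py see the same data. This assumes X is dense;
                # sparse data would need StandardScaler(with_mean=False).
                scaler = StandardScaler()
                X = scaler.fit_transform(X)
            return dict(X=X, y=y)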

@GeoffNN (Contributor, author) commented Mar 28, 2021:

I'm still working on the stochastic variant -- I'll share the figures for both ASAP

@agramfort (Contributor) commented:

I don't know about Madelon, but the others should already be scaled.
