
API Backend refactor #631

Merged 2 commits into kymatio:dev on Feb 13, 2021

Conversation

MuawizChaudhary (Collaborator):

This PR is in progress. None of the tests currently run.

Refactors the named tuple convention out of the 1D backend.

Resolved review threads:
kymatio/backend/numpy_backend.py (two threads)
kymatio/backend/tensorflow_backend.py (three threads)
kymatio/scattering1d/backend/torch_backend.py
tests/scattering1d/test_numpy_backend_1d.py
tests/scattering1d/test_torch_backend_1d.py
@@ -176,6 +177,8 @@ def test_unpad():
# test unpading of a random tensor
x = torch.randn(8, 4, 1)

Torch1DBackend()
Collaborator:

What's going on here?

Resolved review threads:
kymatio/backend/numpy_backend.py
kymatio/backend/tensorflow_backend.py
kymatio/backend/torch_backend.py
kymatio/scattering1d/backend/jax_backend.py
kymatio/scattering1d/backend/numpy_backend.py
tests/scattering1d/test_torch_backend_1d.py
tests/scattering1d/test_torch_scattering1d.py
@@ -0,0 +1,117 @@
import pytest
Collaborator:

Are these tests from this PR or from the test refactoring PR? I don't see why the tests would change with this refactoring.

Collaborator (Author):

This is from the test refactoring PR. We should merge those as soon as possible.

Collaborator:

Hmm, I believe this is merged, so I'm surprised to see the diff below.

@MuawizChaudhary (Collaborator, Author) left a comment:

The skcuda backend is bugging out a bit. In scattering1d/backend/torch_skcuda_backend.py, TorchSKcuda1D doesn't properly override the TorchBackend1D modulus and subsampling Fourier functions. I've spent some time on it and did some research online; the functions should be overridden in the scattering1d/backend/torch_skcuda_backend.py script, but they aren't.
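For reference, overriding a staticmethod in Python is just a matter of redefining it under the same name in the subclass — a minimal sketch of the expected behavior (class names taken from the PR, method bodies purely hypothetical placeholders, not the real kymatio implementations):

```python
class TorchBackend1D:
    @staticmethod
    def modulus(x):
        return 'torch modulus'  # placeholder body

    @staticmethod
    def subsample_fourier(x, k):
        return 'torch subsample'  # placeholder body


class TorchSKcuda1D(TorchBackend1D):
    # An override only takes effect if the subclass redefines the method
    # under the exact same name; otherwise attribute lookup falls through
    # to TorchBackend1D.
    @staticmethod
    def modulus(x):
        return 'skcuda modulus'  # placeholder body

    @staticmethod
    def subsample_fourier(x, k):
        return 'skcuda subsample'  # placeholder body
```

If `TorchSKcuda1D.modulus` still resolves to the parent's version, the usual suspects are a name mismatch or the subclass not actually inheriting from the class being called.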

kymatio/backend/tensorflow_backend.py (review thread resolved)
@eickenberg (Collaborator):

I'd like to discuss today the differences between making changes by subclassing and making changes by passing a constructor argument.

@janden (Collaborator) commented Jul 22, 2020:

@MuawizChaudhary Can you rebase this on top of dev? There are a lot of conflicts that need to be resolved and I don't want us to have to review it twice.

@MuawizChaudhary (Collaborator, Author):

If you've squashed all the changes, I'll rebase it shortly!

@janden (Collaborator) commented Jul 22, 2020:

If you've squashed all the changes, I'll rebase it shortly!

The changes in dev from the 1D test PR? Yeah they're squashed.

@janden (Collaborator) commented Jul 23, 2020:

Tests fail. Please fix and we'll review.

@eickenberg (Collaborator) left a comment:

There are a few things I don't understand (see comments).

Also I am still not 100% convinced we should be using class instances here. Every method I see that takes self as an argument but doesn't use it seems like an unnecessarily class-tied function to me. I would prefer them to be class/static methods.

Can't the whole backend inheritance thing be pulled off without instantiating? Just by using class/static methods and overriding things appropriately?
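Yes, in principle — a minimal sketch of that no-instantiation design, where every backend function is a static or class method and subclasses override only what differs (names are illustrative, not the actual kymatio API):

```python
class NumpyBackend:
    name = 'numpy'

    @staticmethod
    def modulus(x):
        # stand-in for the real modulus computation
        return abs(x)

    @classmethod
    def cdgmm(cls, A, B):
        # pointwise multiplication, shared by all backends
        return A * B


class TensorFlowBackend(NumpyBackend):
    # modulus and cdgmm are inherited unchanged; only the name differs.
    name = 'tensorflow'
```

Here `TensorFlowBackend.cdgmm(2.0, 3.0)` works directly on the class, with no instance ever created.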

return A * B

class NumpyBackend:
    def __init__(self, name='numpy'):
Collaborator:

Why do we need name as an argument? Aren't we just going to override the __init__ and write the name in there?

Collaborator (Author):

We need it so that we can specify the name of the backend. We don't have to if it's the nD numpy backend, since we already specified the name there.

Collaborator:

Better to override it in the __init__. When would we ever need to specify the name of the backend at instantiation time? Also, what's the point of the name attribute anyhow?


class NumpyBackend:
    def __init__(self, name='numpy'):
        import numpy
Collaborator:

Please confirm: This is imported inline in order to be able to import this class into another file without triggering a numpy import?

Collaborator (Author):

I believe it shouldn't trigger a numpy import unless we call the init functions.

I thought the best idea was to import it inline; I believe that might have been what @janden wanted.

Collaborator:

Why is it better to import inline? I figured we would have a global import, but if that's not possible, we can do inline.

Collaborator:

Yeah I think nothing is really stopping us from using a global import.

I'd generally also like to get away as much as possible from making an instance. Ideally all functions are static on the class and we use inheritance to pass down those static functions.

Collaborator:

Yeah I think nothing is really stopping us from using a global import.

Agree. That's how it is for the other backends, as far as I can tell.

I'd generally also like to get away as much as possible from making an instance. Ideally all functions are static on the class and we use inheritance to pass down those static functions.

I remember inheriting static functions as being slightly tricky to deal with when it comes to inheritance, but this may be a false memory. Overall, I think it's simple enough to have these singleton instances since it allows us to store some internal state (like these module references).
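The two options under discussion can be sketched side by side (illustrative only; the deferred variant stores the module reference on the instance, which is exactly the kind of internal state mentioned above):

```python
# Option 1: global import -- numpy is pulled in as soon as this module
# is imported, which is how the other backends currently do it.
# import numpy as np

# Option 2: deferred import -- numpy is only imported when a backend is
# actually instantiated, and the module reference is kept as state.
class NumpyBackend:
    def __init__(self):
        import numpy          # triggered at instantiation, not at import
        self.np = numpy       # module reference stored as instance state
        self.name = 'numpy'

    def modulus(self, x):
        return self.np.abs(x)
```

With singleton instances, the deferred import happens exactly once, when the singleton is built.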

Collaborator:

Why can't we just write those into the class?
I'm happy to keep any state that is necessary, but would avoid it if it's easy.

kymatio/backend/tensorflow_backend.py (review thread resolved)
raise TypeError('The input should be complex.')
class TensorFlowBackend(NumpyBackend):
    def __init__(self):
        super(TensorFlowBackend, self).__init__(name='tensorflow')
Collaborator:

It doesn't make sense to me to pass name as an argument. It should be set and overridden in the __init__.

Collaborator (Author):

The line above?

Collaborator:

self.name = 'tensorflow'

kymatio/backend/tensorflow_backend.py Show resolved Hide resolved

self.contiguous_check(x)

def complex_check(self, x):
Collaborator:

Looks like these checks could totally be inherited from a base class if we wanted that. Not sure if we do.

Collaborator (Author):

I feel like this is a smart idea.

Collaborator:

I guess this is the base class, actually.
But we might be able to use one more level of inheritance to deal with the two different complex number formulations.

cublas.cublasSetStream(handle, stream)
cublas.cublasCdgmm(handle, 'l', m, n, A.data_ptr(), lda, B.data_ptr(), incx, C.data_ptr(), ldc)
return C
class TorchSKcudaBackend:
Collaborator:

But this one I would have expected to inherit from the torch backend.

Collaborator (Author):

We inherit the torch backend from the 1D torch backend when we create the TorchSKcuda 1D backend class. The torch skcuda base backend is also inherited.

Collaborator (Author):

I recommend checking kymatio/scattering1d/backend/torch_skcuda_backend.py.

Collaborator:

@eickenberg, same thing happens for Jax (NP → NP1D → Jax1D) since we need to inherit the 1D-specific functions. Ideally, in the future, the backends will be less dimension-specific and we can have a simpler inheritance structure. Perhaps mixins could simplify the 1D/2D/3D specification when needed.

On another note, I suggest TorchSkcudaBackend. Much less damaging to the eyes.

def irfft(x):
    complex_check(x)
    return scipy.fftpack.ifft(x).real

y = x.reshape(-1, k, x.shape[-1] // k)
Collaborator:

I would add the initial shape back. There is absolutely no reason not to.

y = x.reshape(x.shape[:-1] + (k, x.shape[-1] // k)).sum(-2)

Collaborator:

OK, this part, when I implement it as such, messes with a reshape downstream. I can't properly figure it out, so will leave this as is for now.
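The difference between the two reshapes can be checked directly — a small numpy sketch (the trailing sum over the split axis stands in for the reduction applied downstream; data and shapes are made up for illustration):

```python
import numpy as np

x = np.arange(2 * 3 * 8, dtype=float).reshape(2, 3, 8)
k = 2

# Flattening form currently in the code: collapses all leading axes.
y_flat = x.reshape(-1, k, x.shape[-1] // k).sum(1)

# Shape-preserving form suggested in the review: keeps the batch shape.
y_keep = x.reshape(x.shape[:-1] + (k, x.shape[-1] // k)).sum(-2)

# Same values, but only the second retains the original leading shape,
# which is why downstream code expecting the batch axes can break.
assert y_flat.shape == (6, 4)
assert y_keep.shape == (2, 3, 4)
assert np.allclose(y_flat.reshape(2, 3, 4), y_keep)
```

So a downstream reshape written against the flattened `(6, 4)` layout would need adjusting before the shape-preserving form can be adopted.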

kymatio/scattering1d/backend/numpy_backend.py (review thread resolved)
janden previously requested changes Aug 11, 2020
@janden (Collaborator) left a comment:

Looks good. Just a few things to address, then we can extend this to 2D and 3D.

else:
    contiguous_check(B)

def concatenate(self, arrays):
    return torch.stack(arrays, dim=2)
Collaborator:

Why not dim=1 like in the NP backend?

kymatio/scattering1d/backend/numpy_backend.py (review thread resolved)


def test_modulus(random_state=42):
    """
    Tests the stability and differentiability of modulus
    """

    backend = TorchBackend()
Collaborator:

Why not just import backend from kymatio.backend.torch_backend? It's already been instantiated and that's what the main code will use.

Collaborator:

Especially now, since it's actually not instantiated.

Collaborator:

Though that said, it might be more modular to do it like above, just without the parentheses.
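That is, bind the class itself rather than an instance — a sketch with an illustrative stand-in class (body hypothetical):

```python
class TorchBackend:
    @staticmethod
    def modulus(x):
        return abs(x)  # placeholder body


# No parentheses: `backend` is the class itself, not an instance.
backend = TorchBackend
```

With static methods, `backend.modulus(...)` resolves directly on the class, so the test exercises the same object the library code uses while staying easy to swap out.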

@@ -1,6 +1,6 @@
import pytest
import numpy as np

import tensorflow as tf
Collaborator:

Why do we need this?

Collaborator:

I removed it and it still works, so no reason to keep it.

@@ -106,14 +106,11 @@ def coefficent(n):
y_r = (x_r * coefficents).sum(-1)

z = backend.rfft(x_r)
# increase tolerance here as tensorflow fft is slightly inaccurate due to
Collaborator:

Why is this going away?

Collaborator:

Yeah, why?

tests/scattering1d/test_torch_backend_1d.py (review thread resolved)
@eickenberg (Collaborator):

OK, I just rebased this.

I will need some help to figure out what the intention was in some of these changes.

The global picture is that we are introducing classes for backends instead of the named tuples.

Please take a look at the changes I needed to make in my first commit on top of this to make the tests for numpy backend 2d pass: 8ac8a8b

Basically, I had to write a bunch of attributes onto the instantiated class. I left the original commented out for comparison.

Question: Why is it that we are defining half of the backend in the class and half in the 2d backend file? Is it because those parts are the specific stuff? Should we potentially be doing this using inheritance?

Happy for any input and opinions @janden @MuawizChaudhary

@eickenberg (Collaborator):

OK, I have kludged my way to making the tests pass.

For the specific backends I would suggest using inheritance rather than writing properties, since that is not much better than the named tuples we had before.

Please let me know if you think that makes sense and I'll make those changes. Ideally we get this one merged before the sprint (or at the latest at the beginning of it).

@janden (Collaborator) commented Feb 10, 2021:

Still fails, but for legit reasons (torch_skcuda backend has not been updated to the new architecture).

@janden janden force-pushed the backend_refactor branch 2 times, most recently from 35ae750 to 0738934 Compare February 10, 2021 16:00
@janden (Collaborator) commented Feb 10, 2021:

So tests pass now on Jenkins. Since it's been a while, I want to go over everything again to see what needs to be done before merging.

@eickenberg (Collaborator):

Awesome. I will read through the updated code now.
One thing that I have had simmering for a bit is the instantiation. Do we really need to instantiate these backends?
If not, we could set all the methods to staticmethod and remove the self objects because they are not needed.

@edouardoyallon edouardoyallon self-assigned this Feb 11, 2021
@edouardoyallon (Collaborator):

The fft failure for Python 3.6 only, with conda and tf, sounds like some future promising fun!

self.complex_check(x)
self.contiguous_check(x)

def contiguous_check(self, x):
Collaborator:

Yeah, I'm not happy with these unused self objects.

@eickenberg (Collaborator):

@janden is it OK if I go through the backends and remove any unnecessary use of self in function definitions? I.e., functions that do not depend on state should not need a self object as input.

@janden (Collaborator) commented Feb 11, 2021:

@eickenberg So all the backends have been refactored now to have either class methods or static methods. There are no object instantiations (as far as I can tell). Please have a look and let me know what you think.

@eickenberg (Collaborator) left a comment:

This looks close to mergeable. At this point, I think if we address all the remaining comments, we should be good to go.

return A * B

@classmethod
def cdgmm(cls, A, B):
Collaborator:

This method is identical to the one in the numpy backend, so it should go through by inheritance.

backend.compute_integrals = compute_integrals
backend.concatenate = concatenate

backend = TensorFlowBackend3D()
Collaborator:

Instantiation not necessary, correct?




@eickenberg (Collaborator) left a comment:

OK, I think I'm done for today.

I have addressed a number of outstanding code comments in the PR.

The remaining ones I am not sure how to address, but I have the feeling they are minor.

I did an interactive rebase in order to consolidate @MuawizChaudhary's summer work with my commits from last week, so that in aggregate the tests pass on that commit. Just in case this wasn't a good idea, I have the unsquashed branch fully backed up on my computer. You can also reconstruct the previous status quo by using what you have on your machine and adding my most recent commits on top.

There is one remaining puzzle: I found (and had commented on in the PR at some point) several changes that looked like they were from a previous test refactoring project. I had thought that these might be duplicates, but for most of them, I can't find a trace of a second version of them anywhere in the history. If I am right, then we should make sure to keep these commits, because they do work on the tests.

From my side this can be merged as is, since merging will just add the test commits, and the remaining comments are minor. However, it is also possible to address the comments and split off the TST commits in order to make this PR only about the backend refactor.

I am fine with either. Since it's the last sprint day tomorrow, maybe we can get this in and then try to add the jax backends?



@eickenberg (Collaborator):

OK, as discussed with @janden, I squashed all the straggling TST commits together, and hilariously all those commits boiled down to these very few lines removing the tolerance from the tensorflow fft test: a6dee63

I will now proceed to remove this commit, because we don't actually want that change.

MuawizChaudhary and others added 2 commits February 12, 2021 20:05
This is a squash of many commits.
Muawiz had started this project as part of his summer internship.
In order to finish it, and to make sure the commits in the git history pass tests, we aggregate them here; taken separately they would not pass tests because they contain partial reorganizations.

API adding refactor backends

MAINT include backend initalization at the end of each file. renamed each backend class

MAINT skcuda backend refactored

MAINT removed pass in torch backend

MAINT removal of jax backend

MAINT removed redundant line

STY white space added and removed

STY white space changes

MAINT removed BACKEND_NAME variable

MAINT added checks back into numpy backend

MAINT refactored checks back into torch backends

MAINT checks added back into tensorflow backend

STY removed parenthesis

MAINT fixed incorrect modulus call

MAINT add option to specify name for numpy backend

MAINT removed inplace operation, fixed torck_skcuda calling torch
backend functions

STY removed parenthesis

STY remove parenthesis

MAINT move modulus inside torch class

MAINT remove initalization of checks

MAINT define concatenate function in general backend

MAINT define concatenate function in general backend tensorflow

MAINT define concatenate function in general backend numpy

MAINT do numpy imports in backends

MAINT tensorflow now inheriates from numpy backend

TST changed error message we check for to appropriate one.

MAINT moved checks to top, moved concatenate above.

TST general backend test

TST remove print statement

FIX make tests pass for numpy backend 2D

MAINT make tests pass for all backends. Next step is probably to create subclasses for the different-dimensional backends

Co-authored-by: chaudhm <chaudhm@wwu.edu>
Co-authored-by: Joakim Andén <janden@kth.se>
MAINT Fix spelling of Skcuda backend class

TST Fix bug in 2D torch subsample tests

MAINT Update 2D Torch backend to use inheritance

MAINT Make torch_skcuda backend in 2D use inheritance

MAINT Make 3D torch backend use inheritance

MAINT Make 3D torch_skcuda backend use inheritance

FIX remove backend name as class init argument

ENH numpy 2d backend class

COSM remove commented code

ENH tensorflow 2d backend class

ENH numpy 3d backend class

ENH tensorflow 3d backend class

MAINT Update NP backends with class methods

MAINT Update TF backends with class methods

MAINT Update Torch backends to use class methods

MAINT remove cdgmm from tf backend because inherited from numpy backend

COSM remove commented code

MAINT make tf 3d backend use static and class methods

MAINT remove instantiation from torch backend test

MAINT remove tf import from tf 1d backend test

Co-authored-by: Joakim Andén <janden@kth.se>
@eickenberg eickenberg dismissed janden’s stale review February 13, 2021 01:34

These are change requests from July that have been addressed.

@eickenberg eickenberg merged commit c43e498 into kymatio:dev Feb 13, 2021
@janden (Collaborator) commented Feb 15, 2021:

👍
