Inheritance hierarchy #92
Conversation
|
||
@shape.setter | ||
def shape(self, value): | ||
self._shape = value |
mrocklin
Jan 27, 2018
Collaborator
Perhaps we should avoid the property altogether and just have this be a normal attribute?
Perhaps we should avoid the property altogether and just have this be a normal attribute?
hameerabbasi
Jan 27, 2018
Author
Collaborator
There's no way to inherit it if we do that. And we need shape
directly for ndim
, size
, and indirectly fordensity
.
There's no way to inherit it if we do that. And we need shape
directly for ndim
, size
, and indirectly fordensity
.
mrocklin
Jan 27, 2018
Collaborator
Maybe we just avoid listing it as an abstract method then?
Maybe we just avoid listing it as an abstract method then?
hameerabbasi
Jan 27, 2018
Author
Collaborator
Then there's no point to the base class. We would have to remove all of the properties I mentioned.
The whole point of the class is to access the "common" stuff for all types.
Then there's no point to the base class. We would have to remove all of the properties I mentioned.
The whole point of the class is to access the "common" stuff for all types.
hameerabbasi
Jan 27, 2018
Author
Collaborator
Or we can make it have an __init__
method with just the shape.
Or we can make it have an __init__
method with just the shape.
mrocklin
Jan 27, 2018
Collaborator
This seems like it might work? The super class doesn't necessarily need to be functional on its own.
class SparseArray(object):
@property
def ndim(self):
return len(self.shape)
@property
def size(self):
return reduce(operator.mul, self.shape, 1)
@property
def density(self):
return self.nnz / self.size
class COO(SparseArray):
def __init__(...):
self.shape = ...
This seems like it might work? The super class doesn't necessarily need to be functional on its own.
class SparseArray(object):
@property
def ndim(self):
return len(self.shape)
@property
def size(self):
return reduce(operator.mul, self.shape, 1)
@property
def density(self):
return self.nnz / self.size
class COO(SparseArray):
def __init__(...):
self.shape = ...
hameerabbasi
Jan 27, 2018
Author
Collaborator
It doesn't make sense in an structural way to be using things undefined in the parent class. However, I made it into an attribute of the parent class. This keeps it as an attribute and still makes sense in the inheritance hierarchy.
It doesn't make sense in an structural way to be using things undefined in the parent class. However, I made it into an attribute of the parent class. This keeps it as an attribute and still makes sense in the inheritance hierarchy.
9b6176a
to
43f850d
self[c] = d | ||
else: | ||
raise ValueError('data must be a dict.') | ||
|
||
@property | ||
def dtype(self): | ||
return self._dtype |
mrocklin
Jan 27, 2018
Collaborator
Thoughts on using an attribute rather than a property here? That seems simpler.
My guess is that we're doing this because the superclass has a property. In this case I would be inclined to just drop the property on the superclass.
Thoughts on using an attribute rather than a property here? That seems simpler.
My guess is that we're doing this because the superclass has a property. In this case I would be inclined to just drop the property on the superclass.
hameerabbasi
Jan 27, 2018
Author
Collaborator
COO
uses a property, whereas DOK
uses an attribute. We could make COO
use an attribute as well, however that seems bad since it might not match data.dtype
. I chose the lowest common denominator and went for a property.
COO
uses a property, whereas DOK
uses an attribute. We could make COO
use an attribute as well, however that seems bad since it might not match data.dtype
. I chose the lowest common denominator and went for a property.
mrocklin
Jan 27, 2018
Collaborator
Is it important that they match?
Is it important that they match?
hameerabbasi
Jan 27, 2018
Author
Collaborator
I want to have a base class that can be used to access common things. If I turn it into an attribute on DOK
then I have to take it out of the base class... Which I don't want to do. Anything that is common should be in the base class.
I want to have a base class that can be used to access common things. If I turn it into an attribute on DOK
then I have to take it out of the base class... Which I don't want to do. Anything that is common should be in the base class.
mrocklin
Jan 27, 2018
Collaborator
Anything that is common should be in the base class
I would agree with this to the extent that it's convenient. It looks like by trying to make these two subclasses adhere to the same interface we're forced to increase the level of technology used by one of the subclasses. I have a (probably irrational) aversion to using more advanced features than are necessary, mostly on the grounds of avoiding complex situations and increasing maintainability.
Anything that is common should be in the base class
I would agree with this to the extent that it's convenient. It looks like by trying to make these two subclasses adhere to the same interface we're forced to increase the level of technology used by one of the subclasses. I have a (probably irrational) aversion to using more advanced features than are necessary, mostly on the grounds of avoiding complex situations and increasing maintainability.
mrocklin
Jan 27, 2018
Collaborator
Although that would work for type inference, inheriting from SparseArray won't fail if we forget to implement dtype. Failures are good if we fail to add basic functionality.
I'm not particularly concerned about us forgetting to specify a dtype.
In addition, SparseArray.dtype would be inferred as NoneType in all IDEs instead of a np.dtype object.
What happens now? We don't specify the dtype in the property.
Although that would work for type inference, inheriting from SparseArray won't fail if we forget to implement dtype. Failures are good if we fail to add basic functionality.
I'm not particularly concerned about us forgetting to specify a dtype.
In addition, SparseArray.dtype would be inferred as NoneType in all IDEs instead of a np.dtype object.
What happens now? We don't specify the dtype in the property.
hameerabbasi
Jan 27, 2018
Author
Collaborator
We could, for example, add a docstring and it'd be inferred as a dtype.
We could, for example, add a docstring and it'd be inferred as a dtype.
mrocklin
Jan 28, 2018
Collaborator
I asked about the right way to do this on stack overflow:
I asked about the right way to do this on stack overflow:
hameerabbasi
Jan 28, 2018
•
Author
Collaborator
Great idea. However, I had another: How about a class docstring with an Attributes section?
Edit: I tried this out and it worked fine in PyCharm. :-)
Great idea. However, I had another: How about a class docstring with an Attributes section?
Edit: I tried this out and it worked fine in PyCharm. :-)
mrocklin
Jan 28, 2018
Collaborator
Glad to hear it!
Glad to hear it!
@@ -377,7 +343,7 @@ def todense(self): | |||
""" | |||
result = np.zeros(self.shape, dtype=self.dtype) | |||
|
|||
for c, d in six.iteritems(self.data): | |||
for c, d in self.data.items(): |
mrocklin
Jan 27, 2018
Collaborator
I'm +1 on this change, even if it causes more memory issues in Python 2.
I'm +1 on this change, even if it causes more memory issues in Python 2.
hameerabbasi
Jan 27, 2018
•
Author
Collaborator
With from builtins import dict
it won't, unless the user purposefully passes an old-style dict
. :-)
With from builtins import dict
it won't, unless the user purposefully passes an old-style dict
. :-)
@@ -1,31 +1,21 @@ | |||
import six | |||
from builtins import dict, int |
mrocklin
Jan 27, 2018
Collaborator
What is dict
here in Python 2? It's not a normal builtin dict object?
Ah, ok, I'm now reading about the future
package. I have to admit that I am a bit hesitant to add it as a new dependency just because I haven't heard of it before. Happy to deal with this in the future though.
What is dict
here in Python 2? It's not a normal builtin dict object?
Ah, ok, I'm now reading about the future
package. I have to admit that I am a bit hesitant to add it as a new dependency just because I haven't heard of it before. Happy to deal with this in the future though.
hameerabbasi
Jan 27, 2018
Author
Collaborator
future
is a dependency for many packages. It's more common than six
, I believe. I may be wrong.
It's pip
installable.
future
is a dependency for many packages. It's more common than six
, I believe. I may be wrong.
It's pip
installable.
mrocklin
Jan 27, 2018
Collaborator
future is a dependency for many packages. It's more common than six, I believe. I may be wrong.
This would make me much more comfortable. I would be somewhat surprised to learn this though.
I hesitate to have a dict
type anywhere in the code that is not the standard builtin dict type, both for performance and (more importantly) consistency reasons with the ecosystem.
future is a dependency for many packages. It's more common than six, I believe. I may be wrong.
This would make me much more comfortable. I would be somewhat surprised to learn this though.
I hesitate to have a dict
type anywhere in the code that is not the standard builtin dict type, both for performance and (more importantly) consistency reasons with the ecosystem.
hameerabbasi
Jan 27, 2018
•
Author
Collaborator
I just took a peek through the code for future
.
def items(self):
"""
On Python 2.7+:
D.items() -> a set-like object providing a view on D's items
On Python 2.6:
D.items() -> an iterator over D's items
"""
if ver == (2, 7):
return self.viewitems()
elif ver == (2, 6):
return self.iteritems()
elif ver >= (3, 0):
return self.items()
So performance isn't a concern.
I just took a peek through the code for future
.
def items(self):
"""
On Python 2.7+:
D.items() -> a set-like object providing a view on D's items
On Python 2.6:
D.items() -> an iterator over D's items
"""
if ver == (2, 7):
return self.viewitems()
elif ver == (2, 6):
return self.iteritems()
elif ver >= (3, 0):
return self.items()
So performance isn't a concern.
mrocklin
Jan 27, 2018
Collaborator
What if some other infrastrucural library (like cloudpickle, Cython, PyPy, Numba, ...) has an optimiziation that works well on objects where type(obj) is dict
(this is almost certainly the case in all of those projects). Then presumably that optimization won't apply to the DOK class.
This library is aimed to be infrastructural for the ecosystem. I think that it is important that we be as non-fancy as possible. As another point, if you ever want to see this upstreamed into scipy
then you'll need to convince them about this point as well and they are, in general, far more conservative than I am.
What if some other infrastrucural library (like cloudpickle, Cython, PyPy, Numba, ...) has an optimiziation that works well on objects where type(obj) is dict
(this is almost certainly the case in all of those projects). Then presumably that optimization won't apply to the DOK class.
This library is aimed to be infrastructural for the ecosystem. I think that it is important that we be as non-fancy as possible. As another point, if you ever want to see this upstreamed into scipy
then you'll need to convince them about this point as well and they are, in general, far more conservative than I am.
mrocklin
Jan 27, 2018
Collaborator
Yeah, it's a nice trick, don't get me wrong. I've just seen many many things go wrong once code like this interacts with 1000 other libraries (which again, I think is our intention). I would definitely use a library like this in application code, but for core ecosystem code I tend to be more conservative. I think that many other maintainers of core ecosystem code also hold this belief.
Yeah, it's a nice trick, don't get me wrong. I've just seen many many things go wrong once code like this interacts with 1000 other libraries (which again, I think is our intention). I would definitely use a library like this in application code, but for core ecosystem code I tend to be more conservative. I think that many other maintainers of core ecosystem code also hold this belief.
hameerabbasi
Jan 27, 2018
Author
Collaborator
Hmm. I'd be okay going back to six
, but the memory constrains could get bad. What do you think?
Hmm. I'd be okay going back to six
, but the memory constrains could get bad. What do you think?
hameerabbasi
Jan 27, 2018
Author
Collaborator
I apologize if this causes friction.
Not at all. :-) We're just discussing.
I apologize if this causes friction.
Not at all. :-) We're just discussing.
mrocklin
Jan 28, 2018
Collaborator
I would wait for someone to arrive with a memory issue before optimizing for memory constraints in this way. I don't expect this to be a significant issue in common use. Also, I don't mind Python 2 being less efficient.
I would wait for someone to arrive with a memory issue before optimizing for memory constraints in this way. I don't expect this to be a significant issue in common use. Also, I don't mind Python 2 being less efficient.
hameerabbasi
Jan 28, 2018
•
Author
Collaborator
I removed the six
/future
dependency entirely and added our own basic Python 2/3 compat, however, the dict issue stands.
I removed the six
/future
dependency entirely and added our own basic Python 2/3 compat, however, the dict issue stands.
If there are no other objections, I think this gets a +1 from me. |
This is a more common name for this file in standard projects.
I've renamed py23.py to compatibility.py and pushed to this branch (I hope that's ok). +1 from me to merge |
Thanks! |
* Implement inheritance hierarchy. * Implement inheritance hierarchy. * Make shape an attribute and move it to the base class. * Delete useless properties. * Make dtype into a property of the parent class. * Removed six dependency and added some basic memory Py 2/3 optimisations. * Documentation for SparseArray. * Make dtype into an attribute for DOK. * Rename py23 to compatibility This is a more common name for this file in standard projects.
six
tofuture
(I think it's cleaner, with a few imports we can write Python 2/3 compatible code instead of modifying everything)SparseArray
class with the properties common to bothCOO
andDOK
.COO
andDOK
inherit fromSparseArray
.