strawman implementation of AdaptiveSeq #85

Closed
wants to merge 3 commits into
from

Conversation

Projects
None yet
3 participants
@avibryant
Collaborator

avibryant commented Dec 19, 2012

See #84

@johnynek

This comment has been minimized.

Show comment Hide comment
@johnynek

johnynek Dec 21, 2012

Collaborator

This looks like an excellent strawman.

Two small points:

  1. The Monoid should probably be in the companion object, since Monoid[AdaptiveSeq[T]] is well defined (even group/ring for that matter).
  2. I think you need a specialized .updated method which does a merge with a Single.

Also, I'm not sure I like the name AdaptiveSeq. SparseIndexedSeq would be my choice, even thought not all representations are actually sparse (Spare with fallback to dense). But I'm not super concerned either way.

Collaborator

johnynek commented Dec 21, 2012

This looks like an excellent strawman.

Two small points:

  1. The Monoid should probably be in the companion object, since Monoid[AdaptiveSeq[T]] is well defined (even group/ring for that matter).
  2. I think you need a specialized .updated method which does a merge with a Single.

Also, I'm not sure I like the name AdaptiveSeq. SparseIndexedSeq would be my choice, even thought not all representations are actually sparse (Spare with fallback to dense). But I'm not super concerned either way.

@avibryant

This comment has been minimized.

Show comment Hide comment
@avibryant

avibryant Dec 21, 2012

Collaborator

I'm somewhat confused by both of your points.

  1. Do you really mean Monoid[AdaptiveSeq[T]], or do you mean something like Monoid[AdaptiveSeq[T].Base], which is what AdaptiveSeq's monoid method returns? If the former, what would that even mean? And either way, what do companion objects have to do with it?

  2. do you just mean something like this, on Base?

def updated(idx : Int, value : T) = merge(Single(idx, value))
Collaborator

avibryant commented Dec 21, 2012

I'm somewhat confused by both of your points.

  1. Do you really mean Monoid[AdaptiveSeq[T]], or do you mean something like Monoid[AdaptiveSeq[T].Base], which is what AdaptiveSeq's monoid method returns? If the former, what would that even mean? And either way, what do companion objects have to do with it?

  2. do you just mean something like this, on Base?

def updated(idx : Int, value : T) = merge(Single(idx, value))
@avibryant

This comment has been minimized.

Show comment Hide comment
@avibryant

avibryant Dec 22, 2012

Collaborator

Oh, as for name, I'm not fussy. SparseIndexedSeq is fine, even though as you point out it's slightly misleading.

Collaborator

avibryant commented Dec 22, 2012

Oh, as for name, I'm not fussy. SparseIndexedSeq is fine, even though as you point out it's slightly misleading.

@avibryant

This comment has been minimized.

Show comment Hide comment
@avibryant

avibryant Dec 22, 2012

Collaborator

Ok, renamed and added updated(), and addressed aaron's point about bounds checking.

Collaborator

avibryant commented Dec 22, 2012

Ok, renamed and added updated(), and addressed aaron's point about bounds checking.

@avibryant

This comment has been minimized.

Show comment Hide comment
@avibryant

avibryant Dec 22, 2012

Collaborator

Oh, wait. That's not the right semantics for updated() at all.

Collaborator

avibryant commented Dec 22, 2012

Oh, wait. That's not the right semantics for updated() at all.

+ case Empty => this
+ case Single(i, v) => mergeSingle(i,v)
+ case Sparse(m) => mergeSparse(m)
+ case d:Dense => d.mergeSparse(map)

This comment has been minimized.

Show comment Hide comment
@johnynek

johnynek Dec 22, 2012

Collaborator

this is reversing the order of the plus, right?

So, if the + is not commutative, this will break, won't it?

@johnynek

johnynek Dec 22, 2012

Collaborator

this is reversing the order of the plus, right?

So, if the + is not commutative, this will break, won't it?

This comment has been minimized.

Show comment Hide comment
@avibryant

avibryant Dec 22, 2012

Collaborator

Ah, true, I was assuming commutativity, which I shouldn't have been. That'll make the code uglier :(

@avibryant

avibryant Dec 22, 2012

Collaborator

Ah, true, I was assuming commutativity, which I shouldn't have been. That'll make the code uglier :(

+ def updated(i : Int, v : T) : IndexedSeq[T] = Dense(vec.updated(i, v))
+ }
+
+ val monoid = new Monoid[Base] {

This comment has been minimized.

Show comment Hide comment
@johnynek

johnynek Dec 22, 2012

Collaborator

I don't think scala will be able to find this Monoid implicitly without additional code.

Can you remind me again why you don't lift the SparseIndexedSeq trait up by itself, then have private [algebird] instances that you have here, and use the SparseIndexedSeq companion object to create Empty and get started from there?

With that approach, an implicit def monoid[T:Monoid]: Monoid[SparseIndexedSeq[T]] would be found automatically (and code like Monoid.plus(v1, v2) would just work).

@johnynek

johnynek Dec 22, 2012

Collaborator

I don't think scala will be able to find this Monoid implicitly without additional code.

Can you remind me again why you don't lift the SparseIndexedSeq trait up by itself, then have private [algebird] instances that you have here, and use the SparseIndexedSeq companion object to create Empty and get started from there?

With that approach, an implicit def monoid[T:Monoid]: Monoid[SparseIndexedSeq[T]] would be found automatically (and code like Monoid.plus(v1, v2) would just work).

This comment has been minimized.

Show comment Hide comment
@avibryant

avibryant Dec 22, 2012

Collaborator

I can see getting an implicit Semigroup with that approach, but how do you define zero? You need to know the total length (and maybe the sparse threshold).

For the same reason, you don't have an implicit HyperLogLog Monoid either, do you?

@avibryant

avibryant Dec 22, 2012

Collaborator

I can see getting an implicit Semigroup with that approach, but how do you define zero? You need to know the total length (and maybe the sparse threshold).

For the same reason, you don't have an implicit HyperLogLog Monoid either, do you?

This comment has been minimized.

Show comment Hide comment
@aaron-siegel

aaron-siegel Dec 26, 2012

Contributor

... and the optimal choice of density cutoff is also sensitive to T. I was envisioning that the user would put

implicit val myMonoid = new SparseIndexedSeq[MyType](myLen, mySparseFactor)

somewhere in scope before using this feature.

@aaron-siegel

aaron-siegel Dec 26, 2012

Contributor

... and the optimal choice of density cutoff is also sensitive to T. I was envisioning that the user would put

implicit val myMonoid = new SparseIndexedSeq[MyType](myLen, mySparseFactor)

somewhere in scope before using this feature.

This comment has been minimized.

Show comment Hide comment
@johnynek

johnynek Dec 29, 2012

Collaborator

Good point. A default Monoid on this doesn't make sense.

@johnynek

johnynek Dec 29, 2012

Collaborator

Good point. A default Monoid on this doesn't make sense.

This comment has been minimized.

Show comment Hide comment
@johnynek

johnynek Dec 29, 2012

Collaborator

Agree.

@johnynek

johnynek Dec 29, 2012

Collaborator

Agree.

@aaron-siegel

This comment has been minimized.

Show comment Hide comment
@aaron-siegel

aaron-siegel Dec 26, 2012

Contributor

How about SwitchingIndexedSeq? (I agree that AdaptiveSeq is a bit grandiose)

Contributor

aaron-siegel commented Dec 26, 2012

How about SwitchingIndexedSeq? (I agree that AdaptiveSeq is a bit grandiose)

@aaron-siegel

This comment has been minimized.

Show comment Hide comment
@aaron-siegel

aaron-siegel Dec 26, 2012

Contributor

Another idea would be to call it "SwitchingSeqFamily" or some such. Calling the enclosing class a "Seq" is a bit of a misnomer; there's an argument that a class shouldn't be suffixed "IndexedSeq" unless it actually inherits the IndexedSeq trait. One proposal is

SparseIndexedSeq -> SwitchingSeqFamily (or SparseSeqFamily or some such)
Base -> Element
Empty -> EmptyElement
Single -> SingletonElement
Sparse -> SparseElement
Dense -> DenseElement

which is perhaps more self-documenting nomenclature.

Contributor

aaron-siegel commented Dec 26, 2012

Another idea would be to call it "SwitchingSeqFamily" or some such. Calling the enclosing class a "Seq" is a bit of a misnomer; there's an argument that a class shouldn't be suffixed "IndexedSeq" unless it actually inherits the IndexedSeq trait. One proposal is

SparseIndexedSeq -> SwitchingSeqFamily (or SparseSeqFamily or some such)
Base -> Element
Empty -> EmptyElement
Single -> SingletonElement
Sparse -> SparseElement
Dense -> DenseElement

which is perhaps more self-documenting nomenclature.

@avibryant

This comment has been minimized.

Show comment Hide comment
@avibryant

avibryant Dec 28, 2012

Collaborator

I think I like SparseSeqFamily best. I don't see the need to rename the internal classes, but if I did rename them I would prefer the suffix Seq to Element.

Apart from naming, are people happy with the current patch?

Collaborator

avibryant commented Dec 28, 2012

I think I like SparseSeqFamily best. I don't see the need to rename the internal classes, but if I did rename them I would prefer the suffix Seq to Element.

Apart from naming, are people happy with the current patch?

+
+ def mergeSingle(i : Int, v : T) = {
+ if(i == index)
+ Single(i, tMonoid.plus(v, value))

This comment has been minimized.

Show comment Hide comment
@johnynek

johnynek Dec 29, 2012

Collaborator

This is also backwards for a non-commutative monoid.

PS: Can you add a test of the monoid laws on this using the T = String? That monoid is not commutative, and it should expose any problems we have there.

@johnynek

johnynek Dec 29, 2012

Collaborator

This is also backwards for a non-commutative monoid.

PS: Can you add a test of the monoid laws on this using the T = String? That monoid is not commutative, and it should expose any problems we have there.

@johnynek

This comment has been minimized.

Show comment Hide comment
@johnynek

johnynek Dec 29, 2012

Collaborator

I'm not crazy about the pattern of keeping this classes as inner classes. I'd rather they not be (I'm a little afraid of how Kryo is going to interact with that, because inner classes of immutable objects give it some problems sometimes).

So, I'd rather the inner classes be lifted out.

That said, if anyone else like's Avi's approach better, then I'll accept the majority rule.

Collaborator

johnynek commented Dec 29, 2012

I'm not crazy about the pattern of keeping this classes as inner classes. I'd rather they not be (I'm a little afraid of how Kryo is going to interact with that, because inner classes of immutable objects give it some problems sometimes).

So, I'd rather the inner classes be lifted out.

That said, if anyone else like's Avi's approach better, then I'll accept the majority rule.

@aaron-siegel

This comment has been minimized.

Show comment Hide comment
@aaron-siegel

aaron-siegel Dec 29, 2012

Contributor

I prefer the naming convention: X for the base trait, and ZeroX, SingleX, SparseX, DenseX for the implementations. I'm less particular about what X is, though I have a mild preference for Element over Seq. I'll explain.

I borrowed the term "Family" from GAP, in which a "Family" is a collection of algebraic elements that can be combined with one another in various ways, but not with elements of another family. (The set of elements of a specific group is the canonical example of a family.) It's a fairly broad concept, but one that maps perfectly to the present scenario. Here each instance of the enclosing class defines a family, whose elements are instances of its nested types.

This approach works so well here that it's inspired me to think about various other uses of this design pattern. For example, we could create a class PermutationFamily(n: Int), whose elements are permutations on n letters, equipped with a corresponding Group structure. I think this would be right at home in algebird.

In this grand generalization of avi's design pattern, there are numerous specialized XFamily implementations, each with a nested XFamily.Element type. I think this convention works better than XFamily.X (or some such) since it highlights the common structure across different Family implementations. It's quite analogous to the Enumeration pattern in scala, in which every class XEnumeration has an XEnumeration.Value type.

Of course, this is all highly speculative and presumes that (1) there really is value to building out other Family implementations in algebird, and drawing attention to their common structure; and (2) Oscar's practical concerns over this design pattern (serialization etc) can be overcome.

But in principle I'm very attracted to avi's proposal here. I think the idea of an enclosing Family instance that binds its elements within the type system (rather than in an ad hoc way at runtime) is really quite beautiful, and has many potential applications beyond this one.

Contributor

aaron-siegel commented Dec 29, 2012

I prefer the naming convention: X for the base trait, and ZeroX, SingleX, SparseX, DenseX for the implementations. I'm less particular about what X is, though I have a mild preference for Element over Seq. I'll explain.

I borrowed the term "Family" from GAP, in which a "Family" is a collection of algebraic elements that can be combined with one another in various ways, but not with elements of another family. (The set of elements of a specific group is the canonical example of a family.) It's a fairly broad concept, but one that maps perfectly to the present scenario. Here each instance of the enclosing class defines a family, whose elements are instances of its nested types.

This approach works so well here that it's inspired me to think about various other uses of this design pattern. For example, we could create a class PermutationFamily(n: Int), whose elements are permutations on n letters, equipped with a corresponding Group structure. I think this would be right at home in algebird.

In this grand generalization of avi's design pattern, there are numerous specialized XFamily implementations, each with a nested XFamily.Element type. I think this convention works better than XFamily.X (or some such) since it highlights the common structure across different Family implementations. It's quite analogous to the Enumeration pattern in scala, in which every class XEnumeration has an XEnumeration.Value type.

Of course, this is all highly speculative and presumes that (1) there really is value to building out other Family implementations in algebird, and drawing attention to their common structure; and (2) Oscar's practical concerns over this design pattern (serialization etc) can be overcome.

But in principle I'm very attracted to avi's proposal here. I think the idea of an enclosing Family instance that binds its elements within the type system (rather than in an ad hoc way at runtime) is really quite beautiful, and has many potential applications beyond this one.

@avibryant

This comment has been minimized.

Show comment Hide comment
@avibryant

avibryant Dec 29, 2012

Collaborator

So, assuming Kryo can cope with it, I'm with Aaron that I think this is an elegant and generally applicable pattern. Oscar, how do we make sure Kryo is ok? The way you say "give it some problems sometimes" makes it sound like it's not straightforward to reproduce.

Collaborator

avibryant commented Dec 29, 2012

So, assuming Kryo can cope with it, I'm with Aaron that I think this is an elegant and generally applicable pattern. Oscar, how do we make sure Kryo is ok? The way you say "give it some problems sometimes" makes it sound like it's not straightforward to reproduce.

@johnynek

This comment has been minimized.

Show comment Hide comment
@johnynek

johnynek Apr 8, 2013

Collaborator

So, Avi, I hope you don't mind that I reimplemented this approach in a way more to my liking when we needed SketchMap.

Can we close the request?

Collaborator

johnynek commented Apr 8, 2013

So, Avi, I hope you don't mind that I reimplemented this approach in a way more to my liking when we needed SketchMap.

Can we close the request?

@avibryant

This comment has been minimized.

Show comment Hide comment
@avibryant

avibryant Apr 8, 2013

Collaborator

I'm horribly offended, of course, but yes I'll close the request.

Collaborator

avibryant commented Apr 8, 2013

I'm horribly offended, of course, but yes I'll close the request.

@avibryant avibryant closed this Apr 8, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment