Add a non-sparse MapMonoid/Semigroup/Group/etc... #450

johnynek · 2015-06-12T00:10:42Z

plus(Map("hey" -> 0), Map()) == Map() which surprises people (that zeros are removed). In many cases, you want this, in many cases you don't.

We could change the default, but that might break people, or we could add something like:

import com.twitter.algebird.MapAlgebra.dense._

to opt in to dense versions of all the objects.

The text was updated successfully, but these errors were encountered:

johnynek · 2015-06-12T00:19:00Z

Work around now by doing a .mapValues(Option(_)) and this will make it dense, but that's ugly.

ianoc · 2015-06-12T01:30:34Z

It's also likely slower too I suppose. More allocations of the options. I
think importing the sense would be good

On Thursday, June 11, 2015, P. Oscar Boykin notifications@github.com
wrote:

Work around now by doing a .mapValues(Option(_)) and this will make it
dense, but that's ugly.

—
Reply to this email directly or view it on GitHub
#450 (comment).

jnievelt · 2015-06-12T04:26:30Z

I disagree with the use of dense/spare in this context. It's more a matter of determining which keys are valid. Dense and sparse are typically not a functional distinction like that. For example, both dense and sparse vectors will have a size property with the understanding that indices in [0, size) are valid.

One could also imagine a 'sparse' Map-like Monoid (or Map Aggregator) which, instead of storing full associative pairs (x, zero) as they occur, stores a separate Set of keys which map to zero along with the Map of keys that map to non-zero.

avibryant · 2015-06-12T05:10:19Z

Agree with @jnievelt on the terminology being wrong there; "dense" to me means that the keys have a known, finite domain and that storage is pre-allocated for all of them.

The Option workaround wouldn't work, would it? Because the zero for the Option monoid is None, which means the Nones will get dropped just like any other 0 would.

jnievelt · 2015-06-12T05:13:07Z

edit: actually I think @avibryant would be right, if the MapGroup is using OptionGroup. Does that mean it doesn't use OptionGroup (because there is a OptionMonoid)? Seems like we want a similar thing to happen for Map (i.e., the Group has to be explicitly requested).

I guess I'm not entirely sure how all these things get pulled in anyway.

avibryant · 2015-06-12T05:21:28Z

Oh right, of course.

One term that comes to mind, instead of dense, is monotonic - in the sense that the size of the vectors does not ever decrease as the result of an addition (or subtraction).

jnievelt · 2015-06-12T05:29:32Z

Oh, I see that even MapMonoid has this Group-like behavior (it's just missing negate). So it would use OptionMonoid and everything is good.

johnynek · 2015-06-12T21:17:44Z

the Option wrapper does work (or seemed to when I tried it at the REPL). I think the reason is because Monoid[Option[T]].isNonZero does not recurse: https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/Monoid.scala#L61 just checks to see if it is None.
dense may not be a great name, but considering Map(k -> 0) to be equivalent to Map() is definitely considering the Map to be sparse, in my mind, which is to say, we assume an infinite vector of zeros.

This comes down to this: is this an infinite sparse vector of (K -> Option[V]) where None values are not stored. Or is it an infinite sparse vector of K -> V where there is a zero value in V that is not stored. I think it is totally convention because each can simulate the other.

Monotonic is good, I might go futher: monotonicSize to be explicit what is monotonic here.

Gabriella439 · 2015-06-15T17:32:55Z

Another approach is to wrap the Map in an opaque class that quotients the difference between a key with a zero value an empty key. The idea is that this wrapper would provide a new lookup method that would return Monoid.zero if the key is absent, so that the user cannot detect the difference between the two representations.

ianoc · 2015-06-15T17:36:47Z

deleting is more memory efficient, so i think we'll want to be additive
with anything that keeps more state around

On Mon, Jun 15, 2015 at 10:33 AM, Gabriel Gonzalez <notifications@github.com

wrote:

Another approach is to wrap the Map in an opaque class that quotients the
difference between a key with a zero value an empty key. The idea is that
this wrapper would provide a new lookup method that would return
Monoid.zero if the key is absent, so that the user cannot detect the
difference between the two representations.

—
Reply to this email directly or view it on GitHub
#450 (comment).

jnievelt · 2015-06-15T17:49:08Z

The point of this issue is not just that the MapMonoid is confusing, but also that the functionality it's missing (preserving keys that map to monoid.zero) is desired. So presenting a different interface that makes this impossible doesn't really help.

But yes, it should be additive. It could easily be that applications depend on assumptions like map.contains(k) <=> map(k) != monoid.zero, map.size == map.count { _._2 != monoid.zero }, etc.

Gabriella439 · 2015-06-15T18:09:14Z

Whether or not you use Option you still break the identity laws. One of the consequences of the Monoid laws is that the identity element must be unique. If you have two proposed non-equal identity elements (let's call them i1 and i2) then one of them must necessarily not be the identity.

To see why, you just add i1 and i2 together and whichever one you get back is not the identity. For example, if:

i1 + i2 = i1

... then i1 is not the identity element because i1 + i2 != i2

The only way to truly fix this issue is to have exactly one identity element or quotient the representation so that the user can only distinguish one identity element. The Option solution has the exact same problem because you can't have both Map(k -> None) and Map() both be identity elements.

jnievelt · 2015-06-15T18:19:34Z

Maybe I'm missing something, but isn't Map(k -> None) not an identity candidate because Map(k -> None) + Map() != Map()?

Gabriella439 · 2015-06-15T18:29:54Z

@jnievelt That means that Map() is not a valid identity, because the identity laws say that:

x + zero = x

... but the law does not hold when zero = Map() and x = Map(k -> None)

jnievelt · 2015-06-15T19:08:51Z

Ah, of course you're right. In fact the result is that there are no identity elements, except for degenerate cases like Map[Nothing, Long].

johnynek · 2015-06-15T19:16:24Z

@Gabriel439 the laws require a definition of =. If your definition of = includes only looking at non-zero keys, you have no problem, because really you had zero + zero = zero for these cases.

Gabriella439 · 2015-06-15T20:03:17Z

@johnynek I agree. That's why I believe we should wrap the Map type so that we can quotient the API and then the user wouldn't be able to distinguish multiple identity elements.

sritchie added enhancement beginner labels Oct 24, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a non-sparse MapMonoid/Semigroup/Group/etc... #450

Add a non-sparse MapMonoid/Semigroup/Group/etc... #450

johnynek commented Jun 12, 2015

johnynek commented Jun 12, 2015

ianoc commented Jun 12, 2015

jnievelt commented Jun 12, 2015

avibryant commented Jun 12, 2015

jnievelt commented Jun 12, 2015

avibryant commented Jun 12, 2015

jnievelt commented Jun 12, 2015

johnynek commented Jun 12, 2015

Gabriella439 commented Jun 15, 2015

ianoc commented Jun 15, 2015

jnievelt commented Jun 15, 2015

Gabriella439 commented Jun 15, 2015

jnievelt commented Jun 15, 2015

Gabriella439 commented Jun 15, 2015

jnievelt commented Jun 15, 2015

johnynek commented Jun 15, 2015

Gabriella439 commented Jun 15, 2015

Add a non-sparse MapMonoid/Semigroup/Group/etc... #450

Add a non-sparse MapMonoid/Semigroup/Group/etc... #450

Comments

johnynek commented Jun 12, 2015

johnynek commented Jun 12, 2015

ianoc commented Jun 12, 2015

jnievelt commented Jun 12, 2015

avibryant commented Jun 12, 2015

jnievelt commented Jun 12, 2015

avibryant commented Jun 12, 2015

jnievelt commented Jun 12, 2015

johnynek commented Jun 12, 2015

Gabriella439 commented Jun 15, 2015

ianoc commented Jun 15, 2015

jnievelt commented Jun 15, 2015

Gabriella439 commented Jun 15, 2015

jnievelt commented Jun 15, 2015

Gabriella439 commented Jun 15, 2015

jnievelt commented Jun 15, 2015

johnynek commented Jun 15, 2015

Gabriella439 commented Jun 15, 2015