Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move typelevel/algebra into cats repo. #2041

Closed
LukaJCB opened this issue Nov 25, 2017 · 38 comments
Closed

Move typelevel/algebra into cats repo. #2041

LukaJCB opened this issue Nov 25, 2017 · 38 comments

Comments

@LukaJCB
Copy link
Member

LukaJCB commented Nov 25, 2017

This was discussed during the cats meeting at Scala By The Bay this year and I think most of us thought it'd be a good idea.

@johnynek
Copy link
Contributor

+1

@kailuowang
Copy link
Contributor

+1 under a different root package or cats.algebra?

@johnynek
Copy link
Contributor

maybe it should be cats algebra. It is a shame when we have two different artifact ids with the same class names inside.

@johnynek
Copy link
Contributor

on second thought... I don't know why I said that. The artifact can be the same: org.typelevel:algebra so I don't see why we would have the artifact ID collision.

So in that case, it is best to just keep it as it is.

@denisrosset
Copy link
Contributor

What is the timeframe for that move? I'd like to discuss in algebra some changes I am making in the Spire hierarchy, so that could impact what goes finally into cats.

@ctongfei
Copy link
Contributor

@johnynek I prefer cats.algebra. Sometimes you have your own package called X and you have a package X.algebra and this sometimes causes path collision -- to refer to Typelevel Algebra you must do _root_.algebra. See use cases like this:
https://github.com/ctongfei/nexus/blob/master/algebra/src/main/scala/nexus/algebra/IsReal.scala#L12:15

So maybe cats.algebra._ is a better choice?

@benhutchison
Copy link
Member

If algebra and cats merge, it’s an opportunity for aligning the syntax and type class definitions, see typelevel/algebra#199

@LukaJCB
Copy link
Member Author

LukaJCB commented Sep 21, 2018

Would we want a separate module without the cats-core dependency, or a "real" merge with the dependency or possibly even same module?

@johnynek
Copy link
Contributor

I think we want a separate module because projects like Spire and Algebird that don’t (generally) have higher kinder type classes.

We could put them in kernel, or make algebra depend only on kernel.

@benhutchison
Copy link
Member

IIRC, Kernel was split out to enable the Algebra project to depend on Cats (and not the other way round). If that was the predominant reason (?), then I prefer condensing them into kernel than adding an intermediate node in the dependency tree.

@johnynek
Copy link
Contributor

Kernel was split out of algebra. Cats used to depend on algebra but for some reason Cats wanted to break that dependency. As a compromise we moved the core abstractions into Kernel.

If we are to move all of algebra into kernel then we are back to the starting place of cats depending on all of Algebra but we have renamed it Kernel.

@LukaJCB
Copy link
Member Author

LukaJCB commented Sep 22, 2018

What was the reason to split algebra into kernel? 🤔 Anyone of the old guard still have some info? @non maybe?

@johnynek
Copy link
Contributor

There is a cats issue where it was discussed. The reason was cats wanted fewer dependencies I think and didn’t care about Rings.

@LukaJCB
Copy link
Member Author

LukaJCB commented Sep 27, 2018

I've been trying to find an issue that accurately describes the actual problem with having the Ring and Lattice family in cats, and I can't really seem to find a good reason anywhere? Is it bloat? Jar size? Compatibility guarantees?

@johnynek
Copy link
Contributor

I think it is just OCD around minimization. This code is very stable and had binary compatibility guarantees much longer than cats did.

@benhutchison
Copy link
Member

benhutchison commented Sep 27, 2018 via email

@LukaJCB
Copy link
Member Author

LukaJCB commented Sep 27, 2018

Personally I see no issue with adding the various rings and lattices, they don't add anything measurable to the jar size and are pretty useful for a bunch of things. It also has the advantage that we can have all the instances in one import.

@ctongfei
Copy link
Contributor

ctongfei commented Sep 28, 2018

I think probably a separate package cats.algebra probably would be better since we could also add Module, Field, VectorSpace, MetricSpace, AffineSpace etc. into it. Merging all of these in cats.kernel seems to be bloating the jar size (and the encoding of these typeclasses might not be stable enough to be put into cats.kernel).

@mpilquist
Copy link
Member

My 2 cents: merge algebra, kernel, core, and free in to a single jar.

@johnynek
Copy link
Contributor

Merging artifacts but not introducing a new package name can cause annoying binary issues: the old jar and the new jar get on the path (since the old jar isn’t evicted because it doesn’t exist anymore).

I’d rather rename the types in algebra if we merge: cats.algebra, but I wouldn’t at this point merge kernel and free due to the above issue (unless we renamed the types so there is no chance for confusion, e.g. deleting cats.kernel/cats.free and putting them in cats.

@LukaJCB
Copy link
Member Author

LukaJCB commented Sep 28, 2018

Agreed, so how about this?

We merge algebra and kernel and put the algebra classes in a cats.algebra package within the same module. Then at a later date we can also assess wether we want to e.g. merge cats-free with cats-core.
Is this agreeable? :)

@ctongfei
Copy link
Contributor

@LukaJCB I still believe that making cats.algebra a separate package would be better: Typeclass encodings of these are nowhere as stable as cats.kernel. Merging these would be a pain for maintaining the binary compatibility of cats.core module. A separate module would allow us faster iterations of algebra.

@LukaJCB
Copy link
Member Author

LukaJCB commented Sep 28, 2018

@ctongfei you mean a separate module?

Hmm, I'm not fully sure I agree with that. AFAIK algebra hasn't had a single backwards incompatible change in the last 2 years.

@ctongfei
Copy link
Contributor

@LukaJCB Or do we care about the cats.core module jar size? I actually plan to start a bunch of PRs adding new stuff to algebra (Module, VectorSpace, MetricSpace, AffineSpace etc. etc.)

@johnynek
Copy link
Contributor

I think it would make sense to incubate a “Spaces” module outside algebra for a bit. I agree it can be useful but there are some decision decisions (e.g. does a metric space return a double for the metric value, or some other type).

Actually, on the jar size front. We should probably take a look at how big cats core, kernel and algebra are on 2.12 (on the phone at the moment). That might help us think about this with data in hand.

@ctongfei
Copy link
Contributor

Yeah the size of cats.core is a valid concern. We should get some data on that.
Also I think that MetricSpace should be MetricSpace[A, R] where the R = float/double/etc. could be customized.

@denisrosset
Copy link
Contributor

Hohoho, beware of the module/vector spaces stuff.

Noncommutative modules are a thing, so you need left modules and right modules. With metric and normed spaces, you have the problem that the norm type is not necessarily the scalar type: for example, you can have a vector space over the rationals, but the Euclidean norm requires taking square roots. Or you have a complex vector space, and the scalar is complex while the norm is always real.

Up until VectorSpace, I'm pretty confident of the current Spire implementations. For the rest, it's a mixed bag.

I wouldn't put algebra in cats-kernel, but as a separate module it's fine.

I'd love to have a better ring tower in algebra, including the factorization domains, GCD and Euclidean rings and fields. I'd love to include TruncatedDivision (i.e. the sane % operator) in algebra as well.

@ctongfei
Copy link
Contributor

Completely agree with @denisrosset .

@LukaJCB
Copy link
Member Author

LukaJCB commented Jan 22, 2019

Btw, one cool thing we could do with Semirings available in cats-core:

case class MatChain[A](value: Chain[Chain[A]])

sealed trait AltValidation[+E, +A]
case class Valid[A](a: A) extends AltValidation[Nothing, A]
case class Invalid[E](e: E) extends AltValidation[E, Nothing]
  
type ValidatedMat[E, A] = AltValidation[MatChain[E], A]


def validateInt(s: String): ValidatedMat[String, Int] = 
  try(Valid(s.toInt)) catch { 
    case e: Throwable => Invalid(MatChain.one("No number"))
  }
  
def validateBoolean(s: String): ValidatedMat[String, Boolean] =
  if (s.toLowerCase == "true") Valid(true)
  else if (s.toLowerCase == "false") Valid(false)
  else Invalid(MatChain.one("Not a bool"))
  
  
def validateIntOrBoolean(s: String): ValidatedMat[String, Either[Int, Boolean]] =
  validateBoolean(s).map(_.asRight[Int]) <+>
    validateInt(s).map(_.asLeft)
    

val validInt = "42"
val validBool = "True"
val invalid = "Nope"

println(validateIntOrBoolean(validInt)) // Valid(Left(42))
println(validateIntOrBoolean(validBool)) // Valid(Right(true))
println(validateIntOrBoolean(invalid)) // Invalid(MatChain(Chain(Chain(Not a bool), Chain(No number))))

https://scalafiddle.io/sf/EfMFoe6/3

@LukaJCB
Copy link
Member Author

LukaJCB commented Jun 20, 2019

I'd like to move forward with this at some point, I can probably come up with a PR sometime this weekend to get the ball rolling unless there are any immediate objections :)

@denisrosset
Copy link
Contributor

denisrosset commented Jun 20, 2019

I need to write something more detailed, but I'm just back from ScalaDays and have 4 conferences/workshops in July... so here is my take.

Fundamentally, the hierarchy in typelevel/algebra is sound. I wish to split further the step CommutativeRing -> Field by including GCDs and Euclidean division: that's necessary to work with polynomials, and would match the level of detail we have for basic typeclasses (for example, the painful amount of detail in the additive/multiplicative non/commutative semi/groups).

Additionally, I'd love to see Signed and TruncatedDivision closer to the base level. Signed gives the absolute value, andTruncatedDivision gives one modulo operation for which the result has the sign of the divisor; great when indexing arrays as circular buffers (i.e. -3 fmod 2 = 1 instead of -3 mod 2 = -1).

I'm willing to let go of these minor changes if we can evolve this part of cats later.

That said

I think we can have a better design/naming scheme for both typelevel/algebra and spire.

First of all, take a hard look at the hierarchy: do we really need e.g. noncommutative additive structures? Currently, I have no idea which parts are used in the real world. That's a discussion that should involve all the users of cats/algebra/spire.

Second, CommutativeAdditiveMonoid is a terrible name. If we renamed this thing Plus0, it would be self explanatory. Then CommutativeSemiring would be Plus0Times, and CommutativeRing would be PlusMinusTimes1 which is clear and precise (whereas the mathematical definitions vary between subfields).

Or, we may want to get rid of most of the detail, and follow a coarser hierarchy a la PureScript.

Still, we can't argue those points as a matter of principle. What I can do however, is to prepare minimal variants of Spire that demonstrate possible directions, and discuss those variants with the community.

I'd love to have a bubbling community exploring those matters, as when Cats started.

Finally, the transition to Scala 3 will have a major impact on Spire. The library is full of macros, and support for specialization will (probably) be much reduced in Scala 3. Thus, we can't avoid having a hard look at the underlying design.

To conclude

I don't think the current situation is satisfactory -- my take away from ScalaDays is that precise numerics has value for business (esp. in fintech), and also that spire is intimidating which impacts the health of its community.

Outside of our work on cats/algebra/spire and Spark, Scala lacks a good story for numerics. Several ScalaDays attendees complained about the lack of a numpy equivalent.

Our work can establish sound foundations in that direction. But that's a long term goal, and I think it's really worth fighting for. I can start seriously on that after the summer travels.

Short term, I'm treating the current code base as legacy: don't make breaking changes without a clear direction, do the minimal work to get the ball rolling and if parts can be refactored/documented, do it on the side.

@LukaJCB
Copy link
Member Author

LukaJCB commented Jun 20, 2019

I'm not sure there's enough buy-in for a bubbling community panel. I think having the basic algebraic structures available in Cats would be a great first step in reviving interest. They require no macros and complement what's in cats-kernel perfectly IMO.
By basic algebraic structures I propose going with the various lattices and rings not including field.

What you're describing sounds like a very large undertaking and somewhat out of scope for this issue, though I would absolutely love to be part of that discussion! (Is this going to be the eventual Spire 1.0?) I would love to hear what exactly you mean by minimal variants of Spire, does that include typeclasses only?

@denisrosset
Copy link
Contributor

I'll be busy until August; if you want to merge stuff in cats, I'm all for it -- but I'd love that it stays open for improvement.

Starting August, I'll have time to propose PRs to typelevel/algebra first.

As mentioned, the GCD, Euclidean/truncated division, absolute value stuff in Spire complements nicely what typelevel/algebra provides (not speaking of other type classes, there is a lot of historical baggage in Spire that we are slowly chipping at).

@denisrosset
Copy link
Contributor

Oh yeah, and the goal is to get to Spire 1.0 by making sure we can maintain the high value stuff: for example, support for polynomials, root isolation is fantastic (@non and @tixxit are wizards). We need to ramp up the code coverage/documentation for these.

@tpolecat
Copy link
Member

tpolecat commented Dec 9, 2020

@larsrh and I had a quick chat and it seems like we can set this aside for now (since there seem to be some design issues to work out), which would allow Someone™️ to go ahead and get algebra working on Scala 3, which will then unlock Spire (which is likely to be a lot of work so it would be good to get going on it).

@armanbilge
Copy link
Member

Are you still looking for help with this? (With a little guidance) I might be that Someone™️. I've been hacking at a new project that would definitely benefit (essentially, a re-imagining of Spire's Dist as a monad transformer that implements Cats [Effect] typeclasses; thanks to @SystemFw for suggesting this construct to me on gitter a couple months ago :)

@larsrh
Copy link
Contributor

larsrh commented Apr 30, 2021

So far I think nobody has tackled this, so it'd be great if you could work on it.

@armanbilge
Copy link
Member

Done in #3918.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants