Remove dependency on scalaz.concurrent #321

Closed
pchlupacek opened this issue Feb 17, 2015 · 37 comments

Comments

@pchlupacek
Contributor

Hi, this is just an initial idea. I would like to explore whether we can remove the dependency on scalaz. This is driven by the fact that I would like to have full control of the concurrent primitives (like Task, Future, and perhaps Actor and Strategy) in our code, and not be dependent on the scalaz release cycle for these.

What do you think, guys? I would like to see scalaz-concurrent in our code, and perhaps the scalaz stuff in a separate module of scalaz-stream.

@djspiewak
Member

There are some things from Scalaz that we need above and beyond the concurrency primitives. Most notably, abstractions like Monad and Functor, and utilities like Either. We can get those from Scalaz or we can get them from Cats if you prefer, but we do need them.

Having full control over our own primitives would be great though. Task is fantastic and beautiful and the fact that it's sort of tucked away inside of Scalaz has always been a shame. We lose some nice interop properties by having our own Task, but at the same time, if we're looking to cut the dependency, then interop isn't really a goal anymore.

@rossabaker
Member

I've been considering this as well, and have scalaz-stream compiling on cats (requires publishLocal of cats). Todo.scala lays out what is needed to make it real.

  • There is the scalaz-concurrent work that inspired this ticket.
  • Most necessary abstractions already exist in cats. Catchable and Nondeterminism don't yet.
  • Either3 doesn't, and probably won't, exist in cats. Shapeless coproducts solve this, but that's another dependency.
  • Cats is missing a few utility methods, like maximum.
  • \/ and friends are aliases, pending final decision on non/cats#189.

It's not a small job, but it's viable. The big decision, as I see it, is whether the base abstractions are sourced from cats-core or scalaz-core.

See also:

  • non/cats#21
  • non/cats#32

@bryce-anderson
Contributor

I'm also interested in a scalaz-stream that is unhindered when using cats. I don't have any immediate technical comments that haven't already been discussed.

@pchiusano
Contributor

I'd rather just cut the dependency entirely, like @mpilquist has done for scodec. I do not want to move scalaz-stream from depending on scalaz to depending on cats (see below). Instead, I'd like to investigate:

  • Pulling Task into scalaz-stream, where we fully control it. I think this is beneficial anyway.
  • Making scalaz-stream (to be renamed) a multi-module project; core will have zero dependencies other than scodec-bits. The instances for Monad etc are handy but aren't necessary internal to the library. There may be some annoying stuff though, like having to convert from \/ to using Either and reimplement some functionality currently in scalaz/cats.
  • Providing modules which depend on core that integrate with scalaz, cats, Structures, and whatever else.
  • OR, don't make this repo multi-module, just have it be core, and have separate repos for the integrations with various typeclass providers. This might be better, since then the different integrations can have their own release schedules. Actually, I think I like this much better.

The reason I'd rather just cut the dependency entirely is that I don't really want to pick sides in this whole mess of multiple projects competing to provide the same functionality. To the extent it is possible, I'd like whatever library "wins" to do so on its technical merits, not because of network effects by random projects like scalaz-stream choosing this or that library as their dependency. The only reason I've kept scalaz-stream depending on scalaz is when I last looked into it, it seemed pretty annoying to change. But, my head probably wasn't in it given how fried I was from dealing with scalaz drama, so I do think it's quite possible.

That said if @pchlupacek and/or @rossabaker would like to investigate breaking the dependency entirely, I would heartily endorse that effort! :) I myself do not have the bandwidth to work on it right now, though.

Now, the hardest part will be figuring out what to rename the project... :)

Just to clarify, I am totally fine continuing to depend on scodec-bits. That is a rock solid and stable dependency.

@mpilquist
Member

@pchiusano I am very happy to see this -- multiple repositories, one core along with one for each integration, sounds great. I'm happy to help with the conversion / dependency breaking.

@pchlupacek
Contributor Author

@pchiusano I would like to consider whether we can have a core with zero dependencies. I like scodec, but perhaps having the bytes-xxx project as a sort of module may be more consistent. I understand that it is used only in io, so perhaps we could have an io project that depends on scodec.

@pchiusano
Contributor

Yes, I should say, anyone with an interest in this is welcome to help out, not just @pchlupacek and @rossabaker. :)

As a next step, I'd recommend that someone volunteer to take the lead in creating a new branch which removes scalaz as a dependency, and get a complete inventory of all the stuff missing. I'm guessing this branch will be in a noncompiling state for a while, but I'd still push the WIP in case it is possible to parallelize the work. (Like, we need sequence defined for Either, and these six other utility functions...)

@pchiusano
Contributor

I think I'd be okay with a multimodule project, with all the io stuff separated, and it could depend on scodec-bits, with core having literally zero deps. But I don't have strong feelings either way. I feel pretty comfortable with the scodec-bits dependency just because it is so stable and slow-moving. If we were to do the separate io module, I'd do that as a separate effort from removing the scalaz dependency - they are orthogonal.

@pchlupacek
Contributor Author

Yeah, it was really just a proposal; I am OK with this as it is as well. It is almost like removing the dependency on Scala :-)

@rossabaker
Member

I'm definitely up for exploring the typeclass-lib-agnostic approach. It sounds wonderful on paper, but I envision many important functions will be exiled to, and duplicated across, support modules (including Process.run!). Still, sharing any part of the core is better than a fork.

I will begin spiking at https://github.com/rossabaker/scalaz-stream/tree/topic/lean-core. Watch for either a PR or an admission of defeat soon. :)

@rossabaker
Member

The further I go toward removing scalaz-core in #322, the less appealing it becomes. It already requires a few specializations, and looks to require a few more, including interpreters for Process[Task, _]. One is quickly reminded why we have core type class dependencies. Also, the addition of new functionality that depends on type classes (like a new Process1) will not easily be enjoyed by those on the other side of the fence.

The library already supports Scalaz 7.0 and Scalaz 7.1 with git branches. My topic/cats branch also doesn't diverge much, and could be made more source compatible with syntax to reconcile differences such as pure vs. point. If we cut the scalaz-concurrent dependency, we could support any core library for which someone steps up to maintain a branch. It's essentially a second dimension of cross build, which sucks, but we already do something like it. We still have to "pick a winner" for master, but new additions that don't use exotic type classes will be useful in all branches.

A third approach would be to define our own core typeclasses and then have scataz-like modules to bridge to Scalaz, Cats, etc. The last thing I want is another monad trait in Scala. Instead of underabstracting like #322, it's overabstracting, but I'll put it on the table.

@djspiewak
Member

@rossabaker @pchiusano As noble of a goal as it is to have a completely dependency-free core and to avoid "picking a winner" in the Cats vs Scalaz deathmatch, I think in this case it might be a bit of a fool's errand. As Ross said, there's a reason why we have core typeclass dependencies in the first place.

Now, I can think of a couple of ways that we can make it manageable to publish a scalaz-stream artifact against both cats and scalaz, even without the current git branching scheme (which I'm not a fan of). I'm almost positive I can contort SBT into building multiple artifacts with different source directories. The majority of our sources can be in src/main/scala, and all of our cats/scalaz dependencies can be done through type aliases which are implemented in src/main/scala-scalaz7, src/main/scala-scalaz71 and src/main/scala-cats, respectively. It's not going to be the prettiest thing in the world from a build specification standpoint, but I'm pretty certain that it's possible.
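A rough sketch of what that source-directory split might look like in a build.sbt. The system property name `typeclass.lib` and the artifact naming scheme are hypothetical, chosen only for illustration, and the matching libraryDependencies switch is left out:

```scala
// build.sbt (sketch). "typeclass.lib" is a hypothetical property name.
// Select a flavour with e.g.: sbt -Dtypeclass.lib=cats compile
val flavor = sys.props.getOrElse("typeclass.lib", "scalaz71")

// Shared sources stay in src/main/scala; the flavour-specific type
// aliases come from src/main/scala-scalaz7, src/main/scala-scalaz71
// or src/main/scala-cats.
unmanagedSourceDirectories in Compile +=
  (sourceDirectory in Compile).value / s"scala-$flavor"

// Publish each flavour under its own artifact name.
name := s"scalaz-stream-$flavor"
```

Note that this only varies the sources and the artifact name; the compiled classes would still share the same package names across flavours.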

Beyond that… I'm not sure that it's possible long-term to avoid "picking a winner" in the cats vs scalaz thing. Network effect is everything for any open source project, but especially an upstream framework. Frameworks don't win on technical merits; they win on community. That's just the nature of software, because it is in fact the nature of the people who write the software. As much as I'd like to see Cats succeed, I don't mind scalaz-stream having a hard dependency on scalaz. I would certainly rather have that than have to deal with crazy contortions in dependency resolution and/or specialized function implementations to avoid said dependencies.

So my preferences, in order, would be the following:

  1. Implement SBT voodoo to depend on ALL THE THINGS via source directory splits
  2. Stick with the hard dependency on scalaz, but extract Task into our own subproject so that we can fix stuff (e.g. interrupt semantics)
  3. Ditto the above, except for cats

The main reason that 3 comes below 2 is because we're already hard depending on scalaz, Task is part of scalaz, and in general the status quo is safer and lower risk.

My point is really that I don't think a dependency-free core is feasible. We can either pick a winner, or we can perform SBT magic to side-step that entire question, but I don't think we can shave our heads and withdraw from the World of the Abstracted.

@mpilquist
Member

Cross building could actually be worse for the community unless each cross-built JAR puts the types in discrete packages. Otherwise, we risk incompatibilities with downstream libraries -- imagine, for instance, http4s using scalaz-stream-scalaz and scodec-stream using scalaz-stream-cats, and an app that uses both.

@djspiewak
Member

> Cross building could actually be worse for the community unless each cross-built JAR puts the types in discrete packages. Otherwise, we risk incompatibilities with downstream libraries -- imagine, for instance, http4s using scalaz-stream-scalaz and scodec-stream using scalaz-stream-cats, and an app that uses both.

I raised this point on the scalaz mailing list back when forking was proposed by Kmett. Ultimately, either Scalaz or Cats must win. Completely and utterly. If they both maintain a following but neither reaches "critical mass", then the community has the worst of all possible worlds.

@tonymorris

No, really, neither "must win." In fact, they are not even competing. It is ludicrous to continue suggesting so.

@pchiusano
Contributor

Tony, tone it down please. We're having a discussion. Calling people's
opinions ludicrous is unhelpful.

Anyway this is meant to be a discussion about what the scalaz stream
project should do, and I'd like to keep it focused on that.

My feeling is that if the dependency can't be broken easily I'd rather stay
with a scalaz dependency for the time being. Ross, thanks for your work,
I'd like to review this week and see if there's maybe some other decent
path forward. Also if other people have ideas please do pipe in!

Michael, your point about cross builds is a good one.

Honestly I can't really see myself wanting to build against multiple
dependencies. I'd rather have zero dependencies, or just pick one. If
someone would like to maintain a fork against a different dependency, then
that is of course their right to do so.

@rossabaker
Member

If we factor out scalaz-core dep at an accepted price now, I still see that cost steepening over time. The more anemic core makes it harder to build higher level modules. We see this effect already in text and tcp, struggling with the exile of repartition and translate from core for lack of foundational type classes.

The strategy that @djspiewak lays out is not uncommon in macro projects: src/main/scala is conditionally compiled with scala_2.10 and scala_2.11. It imposes a structural quarantine of the variable code, which is less flexible but easier to maintain than the git model. I'm not sure how to get the packaging @mpilquist suggests without extra hacks.

This extra dimension of cross building is suboptimal and frustrating, but this is where we are in early 2015. I see brilliant people bunkered down on both sides and still others straddling the fence. These strategies aren't desirable, but in this environment, I see them costing far less than a bifurcated community.

@jedws
Contributor

jedws commented Mar 2, 2015

I'm not sure that this complete win is either particularly desirable
or achievable. The two projects are not even really comparable (yet), and
with Cats yet to release any artefacts, the discussion of it maybe
winning is currently hypothetical at best.

As far as the community totally adopting one or the other, the events of
last year were enormously divisive, and as a result there is very little
likelihood of that happening any time soon.

If there is a contest, as Paul said earlier, it needs to be decided on
technical grounds as well as convenience. Currently the benefit of Cats
seems to be that no-one else could possibly be using it, so we won't get
version conflicts. While version conflicts are extremely painful in Scala,
this is a short-term argument; presumably other projects will start using it,
and being a younger library it is more likely to have a more rapid
release schedule, so this benefit recedes in inverse proportion to its
popularity.

@mpilquist
Member

@rossabaker To be clear, I'm not advocating for cross building. I'd much prefer to see this library with zero dependencies and compatibility modules.

@rossabaker
Member

Also to be clear, I am not advocating an exclusive or immediate switch. My branch way up in comment three is exploratory, so we downstream library authors and application developers understand and can plan to deal with the upstream situation. Besides the great schism, we have the production Scalaz 7.0 and 7.1, the imminent and binary incompatible Scalaz 7.2, and an active prototype of a source incompatible Scalaz 8.0.

I would also strongly prefer zero dependencies. Ideally, similar techniques could then be used in downstream libraries like http4s and doobie and remotely, and build an interoperable, minimally opinionated stack. But if that came without costly tradeoffs, I'm not sure why we'd have type classes at all. Now, scodec-bits did it. My question is how was it achieved there, and why does it apparently hurt here? Are we overlooking useful techniques, or was that just a simpler problem?

@pchlupacek
Contributor Author

Folks, can we make a list of the MUST-have type classes etc. in the core library? I mean those that the core implementation depends on.
I think the concurrent stuff is pretty easy to define, but I am kind of struggling to see whether we really have so much usage of scalaz stuff that we cannot put it in a scalaz module.

@djspiewak
Member

> Folks, can we make a list of the MUST-have type classes etc. in the core library?

All of the interpreters either need to be built against a specific type (e.g. Task), or must have an array of typeclasses to provide operations on the otherwise abstract type constructor. Catchable and Functor seem like the obvious ones, but I think Monad might be needed in some cases. Monoid is needed as well with the current implementation.

@rossabaker
Member

The interpreters are the big one. There are a couple traverse_s in core. tcp benefits from ~>, and text benefits from Semigroup.

@pchiusano
Contributor

What do folks think about just specializing all the interpreters to Task?
Obviously, it's less flexible, but it would mean we could avoid having to
duplicate a bunch of typeclasses, and it seems like it might be the only
way to get core dependency-free. Honestly, I cannot recall a time where
I've had to run a Process[F,_] for any F other than Task (or
Nothing).

We would definitely still need translate, and ~>, since Task will be
acting as the 'final object' that everything gets compiled to. But
duplicating one 3 line class doesn't seem like a big deal. It's a shame
Scala doesn't support rank 2 types natively... but anyway.
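For reference, the "3 line class" in question can be sketched as below. This is a from-scratch stand-in written for illustration, not the scalaz definition, and optionToList and both are hypothetical names used only to show the rank-2 behaviour:

```scala
object NaturalTransformationSketch {
  // A natural transformation: converts F[A] to G[A] for every A.
  trait ~>[F[_], G[_]] {
    def apply[A](fa: F[A]): G[A]
  }

  // Example instance: Option to List.
  val optionToList: Option ~> List = new (Option ~> List) {
    def apply[A](fa: Option[A]): List[A] = fa.toList
  }

  // Because apply is polymorphic, one value can be used at several
  // element types, which an ordinary function value cannot do.
  def both(f: Option ~> List): (List[Int], List[String]) =
    (f(Some(1)), f(None: Option[String]))
}
```

The trait encodes the rank-2 type by hiding the quantifier over A inside the apply method.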

We could also, if we really wanted, just use ~> to accept unit and attempt
as first-class values, again without having to bring in any typeclasses:
unit : Id ~> F, etc. bind would need some two-type-parameter version of
~>, I guess. This would be hideous, but it can be wrapped nicely for the
common case of running Task. And if you want to run something other than
a Task stream, you have to do something ugly, but at least it is possible.

@mpilquist
Member

@pchiusano +1 on specializing interpreters to Task.

@djspiewak
Member

While I'm generally in favor of abstraction, so much of the useful stuff in scalaz-stream is already specialized on Task (in particular, everything associated with concurrency), so it's not really much of a loss. In my experience, if you're using Process, you're almost certainly using Process[Task, _]. So… specializing on Task would not be the end of the world, especially if we can gain other (ideally significant) benefits from doing so.

@rossabaker
Member

It's not just interpreters, but it is mostly Task:

  • runFoldMap requires a Monoid. The others can all be specialized for Task and IndexedSeq, which is not a tremendous loss.
  • handle and partialAttempt also require specialization due to Catchable.
  • gatherMap/gather/sequence require specialization due to Nondeterminism.

We'd also lose generic Channel.mapOut and Sink.toChannel syntax for lack of a Functor, but those could also be specialized on Task, I suppose.

@pchiusano
Contributor

I think handle and partialAttempt are unnecessary. They were introduced
before onHalt / onFailure. I'm guessing they can be implemented in
terms of onHalt, or just removed.

re Channel.mapOut and Sink.toChannel, I'd like to change the representation
of Channel and Sink at some point. It should have been type Channel[F,A,B]
= Process[F, A => Process[F,B]], which eliminates the need for the Functor.
It's also somewhat awkward that channels have to return exactly one value
for each input.

I'd probably just make runFoldMap take the binary operation and identity as
regular arguments. Totally reasonable, and if the caller has a monoid, m,
they can still call it easily enough.
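A small sketch of that signature change, using a plain Seq as a stand-in for the stream's output so that it stays self-contained. The Monoid case class and runFoldMapM are hypothetical helpers, included only to show that a caller holding a monoid can still delegate:

```scala
object FoldMapSketch {
  // The binary operation and identity are ordinary arguments,
  // so no Monoid type class is needed here.
  def runFoldMap[A, B](as: Seq[A])(f: A => B)(zero: B)(op: (B, B) => B): B =
    as.foldLeft(zero)((acc, a) => op(acc, f(a)))

  // Hypothetical monoid-like value a binding module might supply.
  final case class Monoid[B](zero: B, append: (B, B) => B)

  // A caller who has a monoid m simply unpacks it.
  def runFoldMapM[A, B](as: Seq[A])(f: A => B)(m: Monoid[B]): B =
    runFoldMap(as)(f)(m.zero)(m.append)

  // Sum of the string lengths: 1 + 2 + 3.
  val total: Int = runFoldMap(Seq("a", "bb", "ccc"))(_.length)(0)(_ + _)
}
```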

I consider Nondeterminism to be a failed experiment, so I don't mind
specializing there.

@runarorama
Contributor

No thank you! We use Process extensively with other free monads (that may or may not eventually compile to Task). Specializing to Task would mean we would have to fork this library.

@tonymorris

@jedws The Scalaz project is motivated by very different aspirations and goals to the cats library. It boggles my mind that we are talking about "competition." A library includes a Functor trait and now it is competing? Is that it? How weird.

I don't mind rewriting a stream library; if only to get away from the bloody nonsense!

/rant

@rossabaker
Member

Specializing to Task does not preclude other interpreters. I don't see why the existing monad/catchable interpreters couldn't still exist in Scalaz support.

@pchiusano
Contributor

Hang on, let's make sure we are talking about the same things here. Just to clarify, we will never specialize Process[F,A] to Process[Task,A]. So we won't change:

trait Process[F[_],A]

to:

trait Process[A]

That would be a huge step backward. Tons of code relies on the ability to use different F, including code internal to scalaz-stream itself, scodec-stream, and I'm sure tons of user code. So that will not change, @runarorama not sure if you were concerned about that.

We are just contemplating whether the runner(s) of Process, like runLog, could be specialized, at least in core. So rather than runLog working for any F with a Monad[F] and Catchable[F], it would be defined just for a Process[Task,A]. Also, as @rossabaker points out, there could be Monad/Catchable-generic versions of the various runners in the scalaz binding.

The reason I suspected specializing the runners to Task might not be much of a limitation in expressiveness is that if you have a monad, G, that you are using for the F in Process[F,A], you can sometimes (often? always?) either run the Process[G,A] to get a G[Blah], and then convert the G to a Task, or you can call translate on the Process[G,A] to get a Process[Task,A], and then run that. @runarorama or anyone else, do you have a concrete G where that doesn't work out, or a general class of examples? If so that would be really useful to think about. Since G also has to be Catchable for all the runner functions, it's going to have to be something Task or IO-like.

The only examples I could think of are basically things that are isomorphic to Env => Task[A], which can be handled via translate (this is the strategy used in scodec-stream and in the tcp module), which can bind Env. But perhaps I am just not very creative at coming up with examples. :)
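To make the translate-then-run strategy concrete, here is a minimal sketch. Task, Process, ~>, Env and bindEnv below are toy stand-ins written for this illustration, not the scalaz-stream or scalaz-concurrent definitions; the only point is the shape, where a stream in Env => Task is translated to Task by binding the Env and is then run by a Task-only runner:

```scala
object TranslateSketch {
  // Toy stand-in for Task: a deferred computation.
  final class Task[+A](thunk: () => A) { def run: A = thunk() }
  object Task { def delay[A](a: => A): Task[A] = new Task(() => a) }

  trait ~>[F[_], G[_]] { def apply[A](fa: F[A]): G[A] }

  // Toy stream: a finite list of effects in F, one output each.
  final case class Process[F[_], A](steps: List[F[A]]) {
    def translate[G[_]](f: F ~> G): Process[G, A] =
      Process[G, A](steps.map(s => f(s)))
  }

  // Runner specialized to Task, as contemplated for core.
  def runLog[A](p: Process[Task, A]): Task[IndexedSeq[A]] =
    Task.delay(p.steps.map(_.run).toVector)

  // An effect that needs an environment before it yields a Task.
  final case class Env(prefix: String)
  type EnvTask[A] = Env => Task[A]

  // Binding the Env turns EnvTask into Task.
  def bindEnv(env: Env): EnvTask ~> Task = new (EnvTask ~> Task) {
    def apply[A](fa: EnvTask[A]): Task[A] = fa(env)
  }

  val greetings: Process[EnvTask, String] = Process[EnvTask, String](List(
    (e: Env) => Task.delay(e.prefix + " world"),
    (e: Env) => Task.delay(e.prefix + " again")
  ))

  // Translate first, then use the Task-only runner.
  val result: IndexedSeq[String] =
    runLog(greetings.translate(bindEnv(Env("hello")))).run
}
```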

@pchlupacek
Contributor Author

Well, I think for runners we can introduce a type class, ProcessRunner, and provide a Task instance in the core library, whereas the others can live in scalaz/xxx bindings?

i.e.

def runLog(implicit runner: ProcessRunner[F, O]): F[IndexedSeq[O]] = runner.runLog

object Task {
  implicit def runner[O]: ProcessRunner[Task, O] = ???
}
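A self-contained version of this idea might look like the sketch below. It is only one possible reading of the snippet above: Task and Process are toy stand-ins, and ProcessRunner is parameterized on F alone (with O as a method type parameter) rather than on both F and O, which is an assumption made to keep the instance a single value:

```scala
object ProcessRunnerSketch {
  // Toy stand-ins; not the real scalaz-stream or scalaz-concurrent types.
  final class Task[+A](thunk: () => A) { def run: A = thunk() }
  object Task {
    def delay[A](a: => A): Task[A] = new Task(() => a)

    // The instance core would provide, found via Task's implicit scope.
    implicit val runner: ProcessRunner[Task] = new ProcessRunner[Task] {
      def runLog[O](p: Process[Task, O]): Task[IndexedSeq[O]] =
        Task.delay(p.steps.map(_.run).toVector)
    }
  }

  // Type class over F only; other instances could live in binding modules.
  trait ProcessRunner[F[_]] {
    def runLog[O](p: Process[F, O]): F[IndexedSeq[O]]
  }

  // Toy stream: a finite list of effects in F, one output each.
  final case class Process[F[_], O](steps: List[F[O]]) {
    def runLog(implicit runner: ProcessRunner[F]): F[IndexedSeq[O]] =
      runner.runLog(this)
  }

  val out: IndexedSeq[Int] =
    Process[Task, Int](List(Task.delay(1), Task.delay(2))).runLog.run
}
```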

@pchiusano
Contributor

Isn't ProcessRunner just going to be basically Monad + Catchable,
though? Either that or all the ProcessRunner implementations duplicate
the same logic... which is rather error prone.

@rossabaker
Member

One might summon a ProcessRunner from a Monad and a Catchable. I admit to not having explored this technique outside a trivial REPL example: https://gist.github.com/rossabaker/bf76b4d3449636a18c12
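As a rough illustration of that derivation (written independently of the linked gist, whose contents are not reproduced here), a single implicit def can build a runner for any F that has both instances. Monad, Catchable, Process and ProcessRunner below are minimal stand-ins, and scala.util.Try plays the role of the effect type purely so the example runs without dependencies:

```scala
import scala.util.{Failure, Success, Try}

object DerivedRunnerSketch {
  // Minimal stand-ins for the type classes a binding module would import.
  trait Monad[F[_]] {
    def point[A](a: => A): F[A]
    def bind[A, B](fa: F[A])(f: A => F[B]): F[B]
  }
  trait Catchable[F[_]] {
    def attempt[A](fa: F[A]): F[Either[Throwable, A]]
    def fail[A](err: Throwable): F[A]
  }

  final case class Process[F[_], O](steps: List[F[O]])

  trait ProcessRunner[F[_]] {
    def runLog[O](p: Process[F, O]): F[IndexedSeq[O]]
  }

  // One shared implementation: any F with both instances gets a runner.
  implicit def derivedRunner[F[_]](implicit M: Monad[F], C: Catchable[F]): ProcessRunner[F] =
    new ProcessRunner[F] {
      def runLog[O](p: Process[F, O]): F[IndexedSeq[O]] = {
        val empty: F[IndexedSeq[O]] = M.point(IndexedSeq.empty[O])
        p.steps.foldLeft(empty) { (acc, step) =>
          M.bind(acc) { seen =>
            // attempt marks where cleanup could run before re-raising.
            M.bind(C.attempt(step)) {
              case Right(o)  => M.point(seen :+ o)
              case Left(err) => C.fail[IndexedSeq[O]](err)
            }
          }
        }
      }
    }

  implicit val tryMonad: Monad[Try] = new Monad[Try] {
    def point[A](a: => A): Try[A] = Try(a)
    def bind[A, B](fa: Try[A])(f: A => Try[B]): Try[B] = fa.flatMap(f)
  }
  implicit val tryCatchable: Catchable[Try] = new Catchable[Try] {
    def attempt[A](fa: Try[A]): Try[Either[Throwable, A]] = fa match {
      case Success(a) => Success(Right(a))
      case Failure(e) => Success(Left(e))
    }
    def fail[A](err: Throwable): Try[A] = Failure(err)
  }

  def runLog[F[_], O](p: Process[F, O])(implicit r: ProcessRunner[F]): F[IndexedSeq[O]] =
    r.runLog(p)

  val ok: Try[IndexedSeq[Int]] = runLog(Process[Try, Int](List(Try(1), Try(2))))
  val failed: Try[IndexedSeq[Int]] =
    runLog(Process[Try, Int](List(Try(1), Failure(new RuntimeException("boom")))))
}
```

Since the logic lives in one place, the concern about each instance duplicating it would not apply to instances obtained this way.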

@pchlupacek
Contributor Author

@pchiusano Yes, exactly. However, we do not have Monad + Catchable in the stream core, which is why we can introduce this. I don't think we need Monad in the stream core, but perhaps Catchable is a reasonable type class to include there.

@pchiusano
Contributor

Closing. This is done in new design.
