Make testing IO programs easier #263

Closed
7 tasks
jdegoes opened this issue Sep 24, 2018 · 79 comments
Labels
enhancement New feature or request
Milestone

Comments

@jdegoes
Member

jdegoes commented Sep 24, 2018

Testing IO programs is not easy right now. One must either use heavy-duty, type-unsafe, and dysfunctional mocking machinery from the world of Java, or one must invent a lot of custom and complex machinery for use with final tagless style.

Testing IO programs should be easy, which means that it should be possible to make programs that are polymorphic in the effect type, so that either IO may be used, or a hypothetical TestIO.

This ticket tracks the work involved in order to reach this goal.

  • Identify the minimal set of operations necessary to implement the rest of IO (likely smaller than the case classes that extend IO)
  • Abstract over these operations in an Effect[F[_, _]] type class, and create an instance of the type class for IO
  • Create EffectSyntax[F[_, _]] class, which can add methods onto any Effect: F. Importing scalaz.zio._ should bring the syntax into scope automatically, without requiring additional imports.
  • Move all non-minimal methods of the IO class into EffectSyntax[F[_, _]], and ensure no code is broken.
  • Create EffectFunctions class, which can add top-level helper functions for any Effect-like data type.
  • Move all functions of IO object into EffectFunctions[F[_, _]], have the IO companion object extend EffectFunctions (specialized for IO), and ensure no code is broken
  • Use whatever Scaladoc tricks are necessary to make things appear in obvious places

There are a few details:

  1. First, Effect cannot include sync or async, since these methods are incompatible with testing code. However, it can include all other primitive operations that do not wrap effectful code, including asyncIO / asyncPure.
  2. Second, some structures will have to be generalized, most notably Fiber, which will now have to be parameterized over F[_, _]; probably Async, which can hide IO values; and maybe Ref (see next point).
  3. Third, it's possible Ref will have to go into the core set of Effect operations, e.g. def newRef[A](a: A): F[Nothing, Ref[F, A]]. This might be enough to make Promise and then everything else polymorphic in the effect type, which is a desirable goal of this ticket (though not a necessary one, since it can always be done later if difficulties arise).

Once all these steps have been completed, it will be possible to introduce a TestIO data type, and the user won't have to implement the numerous methods that make IO actually useful. It will only be necessary to implement a few key methods, and of course, we will provide TestIO separately in a later ticket, to make the task even easier.
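A minimal sketch of the shape this could take, under loud assumptions: `MiniIO` is a stand-in for the real IO so the code runs without dependencies, and the method set of `Effect`, the members of `EffectSyntax`, and the `safeDivide` / `program` helpers are all hypothetical names chosen for illustration, not the scalaz-zio API. In line with detail 1 above, this `Effect` has no sync or async.

```scala
// Hypothetical sketch: none of these names are the real scalaz-zio API.
object EffectSketch {
  // Stand-in for a bifunctor IO: a suspended computation that fails with E or succeeds with A.
  final case class MiniIO[E, A](run: () => Either[E, A])

  // A deliberately tiny "minimal set of operations" (assumed, not taken from the ticket).
  trait Effect[F[_, _]] {
    def point[E, A](a: => A): F[E, A]
    def fail[E, A](e: E): F[E, A]
    def flatMap[E, A, B](fa: F[E, A])(f: A => F[E, B]): F[E, B]
    // Error handling that is allowed to change the error type.
    def redeem[E, A, E2, B](fa: F[E, A])(err: E => F[E2, B], succ: A => F[E2, B]): F[E2, B]
  }

  implicit val miniIOEffect: Effect[MiniIO] = new Effect[MiniIO] {
    def point[E, A](a: => A): MiniIO[E, A] = MiniIO[E, A](() => Right(a))
    def fail[E, A](e: E): MiniIO[E, A] = MiniIO[E, A](() => Left(e))
    def flatMap[E, A, B](fa: MiniIO[E, A])(f: A => MiniIO[E, B]): MiniIO[E, B] =
      MiniIO[E, B](() => fa.run() match {
        case Left(e)  => Left(e)
        case Right(a) => f(a).run()
      })
    def redeem[E, A, E2, B](fa: MiniIO[E, A])(err: E => MiniIO[E2, B], succ: A => MiniIO[E2, B]): MiniIO[E2, B] =
      MiniIO[E2, B](() => fa.run() match {
        case Left(e)  => err(e).run()
        case Right(a) => succ(a).run()
      })
  }

  // Derived ("non-minimal") methods live in syntax, written once for every Effect.
  implicit class EffectSyntax[F[_, _], E, A](fa: F[E, A])(implicit F: Effect[F]) {
    def flatMap[B](f: A => F[E, B]): F[E, B] = F.flatMap(fa)(f)
    def map[B](f: A => B): F[E, B] = F.flatMap(fa)(a => F.point[E, B](f(a)))
    def orElse[E2](that: => F[E2, A]): F[E2, A] =
      F.redeem[E, A, E2, A](fa)(_ => that, a => F.point[E2, A](a))
  }

  // A program polymorphic in the effect type.
  def safeDivide[F[_, _]](x: Int, y: Int)(implicit F: Effect[F]): F[String, Int] =
    if (y == 0) F.fail[String, Int]("division by zero") else F.point[String, Int](x / y)

  def program[F[_, _]: Effect]: F[String, Int] =
    for {
      a <- safeDivide[F](10, 2)
      b <- safeDivide[F](a, 0).orElse(safeDivide[F](a, 1))
    } yield a + b
}
```

With this shape, `program[MiniIO]` runs against the stand-in, and a test data type would only need to supply its own `Effect` instance to reuse `map`, `orElse`, and everything else defined in the syntax class.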

/cc @mmenestret

@jdegoes jdegoes added the enhancement New feature or request label Sep 24, 2018
@jdegoes jdegoes added this to the 1.0.0 milestone Sep 24, 2018
@edmundnoble
Contributor

I don't see what an Effect type class would actually do for testing. One thing that would certainly be nice is a testz harness for zio, which zio should probably implement itself to avoid circular dependencies.

@jdegoes
Member Author

jdegoes commented Sep 24, 2018

@edmundnoble A user should be able to write code over IO, which interacts with the real world, and over TestIO, which does not interact with the real world. That's not possible without subtyping or type classes.

@edmundnoble
Contributor

Can you be specific about what would be tested exactly about code running in TestIO? Concurrency, error handling? Perhaps the best thing to do would be to have a different interpreter for IO.

@neko-kai
Member

So, basically, cats-effect with blessed primitives and without Sync at the bottom?

Note that one doesn't have to stoop down to using Java libraries for test doubles – we've found it pretty easy to test IO with distage.

And tagless final situation isn't so bad, quantified implicits can lift nearly all the existing tagless final machinery for F[_] kind to ZIO – the only gap is that MonadError isn't very useful, as it cannot change the error type after catch.

@jdegoes
Member Author

jdegoes commented Sep 27, 2018

@edmundnoble A different interpreter doesn't help, because people are using IO.sync, IO.async, which interact with the real world and which cannot be mocked (not easily, anyway).

@Kaishh The module approach is interesting. I'll work up an alternative soon.

Quantified implicits look interesting, too, and could make a nice addition to interop. 😄

@pshirshov
Member

pshirshov commented Sep 28, 2018

@jdegoes could you give us a hint about what you dislike in our approach? At the moment we have several services in production and we use type-level code with cats/zio. And things are going well.

To get some impressions regarding our tech you may check a demo repo: https://github.com/pshirshov/izumi-workshop-01

Also there is an interesting example which I wrote in 10 minutes during my workshop: https://github.com/pshirshov/izumi-workshop-01/blob/live-session/app/launcher/src/test/scala/com/github/pshirshov/izumi/workshop/w01/ComplexTest.scala#L91 , note that (1) test suites may be polymorphic on the monad, (2) that example uses dynamic module resolution (aka plugins) but there is an option to keep everything static.

Also you may find some slides here (there is some observation of the DI and our answer to microservice/monolith problems): https://github.com/7mind/slides/blob/master/02-roles/target/roles.pdf

In short, our approach is based on:

  1. Late binding in form of a planner and an interpreter
  2. Garbage collection on plans
  3. Good integration with scala typesystem

This allows us to

  1. Have dynamic and static DI
  2. Avoid manual context management and unnecessary instantiations
  3. Write decoupled modular apps using type-level code

So we have a huge performance boost. (GC makes an incredible contribution: DI with GC is as much cheaper for an engineer than typical DI as managed memory is cheaper than unmanaged memory.)
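To make "garbage collection on plans" concrete, here is one possible reading, sketched as plain reachability pruning over a dependency graph. This is an assumption about the idea, not the distage API: `Binding`, `collect`, and the sample plan are hypothetical names invented for the example.

```scala
// Hypothetical sketch of "garbage collection on plans"; this is not the distage API.
object PlanGcSketch {
  // A binding names a component and the components it depends on.
  final case class Binding(name: String, dependsOn: Set[String])

  // Keep only the bindings reachable from the requested roots.
  def collect(plan: Map[String, Binding], roots: Set[String]): Map[String, Binding] = {
    @scala.annotation.tailrec
    def reach(frontier: Set[String], seen: Set[String]): Set[String] =
      if (frontier.isEmpty) seen
      else {
        // Direct dependencies of everything on the current frontier.
        val next = frontier.flatMap(n => plan.get(n).map(_.dependsOn).getOrElse(Set.empty[String]))
        reach(next -- seen -- frontier, seen ++ frontier)
      }
    val live = reach(roots, Set.empty[String])
    plan.filter { case (name, _) => live(name) }
  }

  // Sample plan: AdminConsole and Database are not needed by Crawler.
  val plan: Map[String, Binding] = List(
    Binding("Crawler", Set("HttpClient", "Clock")),
    Binding("HttpClient", Set.empty[String]),
    Binding("Clock", Set.empty[String]),
    Binding("AdminConsole", Set("Database")),
    Binding("Database", Set.empty[String])
  ).map(b => b.name -> b).toMap
}
```

Asking for only the `Crawler` root drops `AdminConsole` and `Database` from the plan, so nothing would need to instantiate them.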

@jdegoes
Member Author

jdegoes commented Sep 28, 2018

@pshirshov I actually quite like it.

This ticket is to essentially create BifunctorIO, which you had to create and maintain on your own because it did not exist in ZIO. I think something like this will end up in most applications of DI or testing, so it may as well live inside ZIO.

The particular driving motivation behind this ticket for me is to make it easy to create polymorphic code that, in production, interacts with external, effectful systems; and in unit tests, simulates interaction with external systems. DI can be used to solve part of this problem (namely, the wiring part; and many others besides, of course); but currently the polymorphic part of the problem requires the user create their own abstraction layer around IO, similar to BifunctorIO.

In discussions with @edmundnoble, it occurs to me there's another way to solve the problem that does not require polymorphic machinery.

Let's say we have:

def getPage(url: URL): IO[HttpError, Html]

We have code that uses this to implement a web crawler:

def crawl(roots: Set[URL], ...): IO[Exception, Unit]

We want to test crawl without actually interacting with the Internet. So we move to final tagless:

trait HttpClient[F[_, _]] {
  def getPage(url: URL): F[HttpError, Html]
}
def crawl[F[_, _]: HttpClient](roots: Set[URL], ...): F[Exception, Unit]

This is not a sufficient constraint for crawl, we need something like BifunctorIO:

trait HttpClient[F[_, _]] {
  def getPage(url: URL): F[HttpError, Html]
}
def crawl[F[_, _]: HttpClient: BifunctorIO](roots: Set[URL], ...): F[Exception, Unit]

Now we have a polymorphic method that can be used in production and can be tested. There remain a few problems: first off, the polymorphic machinery is not small and complicates explaining this to a user; second, it's not very easy to write a test IO (which is actually the subject of #264).

The question is, do we need to go final tagless in order to achieve these benefits?

What if instead we just changed HttpClient:

trait HttpClient {
  def getPage(url: URL): IO[HttpError, Html]
}

Now crawl doesn't need to be polymorphic, it just needs HttpClient:

def crawl(client: HttpClient, roots: Set[URL], ...): IO[Exception, Unit]

The advantage is that we don't need any polymorphic machinery, nor do we need what are essentially lawless type classes; the disadvantage is that due to lack of polymorphism, crawl can sneak any effect it likes into its execution; so therefore it requires programmer discipline to ensure only "benign" (testable) effects are utilized in the implementation.
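A minimal sketch of this dictionary-passing style, with several assumptions: `IO` is aliased to `Either` purely as a dependency-free stand-in, `URL` and `Html` are plain strings, `InMemoryHttpClient` is a hypothetical test double, and `crawl` is simplified to return page sizes rather than the `IO[Exception, Unit]` of the signature above.

```scala
// Hypothetical sketch; IO is a stand-in, not scalaz.zio.IO.
object HttpClientSketch {
  // Stand-in for IO so the sketch runs without dependencies (an assumption).
  type IO[E, A] = Either[E, A]

  type URL  = String
  type Html = String
  final case class HttpError(message: String)

  trait HttpClient {
    def getPage(url: URL): IO[HttpError, Html]
  }

  // Test double: serves pages from an in-memory map and never touches the network.
  final class InMemoryHttpClient(pages: Map[URL, Html]) extends HttpClient {
    def getPage(url: URL): IO[HttpError, Html] =
      pages.get(url).toRight(HttpError(s"not found: $url"))
  }

  // Simplified crawl: fetches every root and records the size of each page.
  def crawl(client: HttpClient, roots: Set[URL]): IO[HttpError, Map[URL, Int]] =
    roots.foldLeft[IO[HttpError, Map[URL, Int]]](Right(Map.empty[URL, Int])) { (acc, url) =>
      for {
        sizes <- acc
        html  <- client.getPage(url)
      } yield sizes + (url -> html.length)
    }
}
```

A production implementation of `HttpClient` would wrap a real HTTP call; `crawl` itself stays unchanged, which is the whole appeal, and also why nothing stops it from calling other effects directly.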

The approach could probably be improved. HttpClient has poor composition properties. If crawl needs a clock, then we must write:

def crawl(client: HttpClient, clock: Clock, roots: Set[URL], ...): IO[Exception, Unit]

One could improve composition by introducing modules:

trait HttpService {
  def httpClient: HttpClient
}
trait ClockService {
  def clock: Clock
}

Now services compose using intersection types, so one can write:

def crawl(services: HttpService with ClockService, roots: Set[URL], ...): IO[Exception, Unit]

This is probably very easy to teach to beginners when they run into the question, "How can I test some ZIO code I wrote without actually interacting with the real world?" It doesn't require any third-party libraries or any machinery, just basically convention and discipline.
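The module convention can be sketched as follows. The assumptions: `IO` is again an `Either` stand-in, errors are plain strings, and `timedFetch` and `TestServices` are hypothetical names; only `HttpService`, `ClockService`, `HttpClient`, and `Clock` echo the traits named above, with method signatures simplified.

```scala
// Hypothetical sketch of services composed with intersection types.
object ServicesSketch {
  // Stand-in for IO so the sketch runs without dependencies (an assumption).
  type IO[E, A] = Either[E, A]

  trait HttpClient { def getPage(url: String): IO[String, String] }
  trait Clock      { def nowMillis: IO[Nothing, Long] }

  trait HttpService  { def httpClient: HttpClient }
  trait ClockService { def clock: Clock }

  // The parameter type states exactly which services the function may use.
  def timedFetch(services: HttpService with ClockService, url: String): IO[String, (Long, Int)] =
    for {
      t    <- services.clock.nowMillis
      html <- services.httpClient.getPage(url)
    } yield (t, html.length)

  // One object can satisfy several service traits at once; this one is fully deterministic.
  object TestServices extends HttpService with ClockService {
    val httpClient: HttpClient = new HttpClient {
      def getPage(url: String): IO[String, String] =
        if (url == "http://example.test") Right("<html></html>")
        else Left(s"unreachable: $url")
    }
    val clock: Clock = new Clock {
      def nowMillis: IO[Nothing, Long] = Right(1000L)
    }
  }
}
```

Adding a third service means widening the parameter type (for example with a further `with`), not adding another positional argument to every caller.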

The question is whether or not that's a good thing to do, or whether it's better to introduce BifunctorIO (AKA Effect in this ticket) and to teach polymorphism from the very beginning.

A final option I've considered is modularizing IO. In this world, IOModule contains the IO type. So all IO programs will have to pass around an IOModule. e.g.:

class MyProgram(ioM: IOModule) {
  import ioM._

  val io = IO.point("Hello world")
}

In this world, IOModule can contain an F[_] that encodes the type of leaf effects possible (it's a sum type of possible leaf operations, like the F[_] in a freer monad). In production code (IOLive), this would be either SyncEffect or AsyncEffect. In testing code (IOTest[F]), it could be operations such as GetUrl.

Running an IO would require a way to transform from F into some set of runtime operations.

The module way of programming is very bulky and cumbersome and not really idiomatic FP Scala. On the other hand, it lets you reuse a lot of machinery (you could then roll your own IO monads with any behavior you want, cheaply and without much ceremony).
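The leaf-effect idea can be illustrated independently of any IOModule design. The sketch below is an assumption-laden toy, not a proposal for the real encoding: `Program`, `Done`, `Leaf`, `Handler`, `TestOp`, and `inMemory` are hypothetical names, the interpreter is not stack-safe, and only `GetUrl` comes from the text above.

```scala
// Hypothetical sketch: programs as data over a sum type of leaf operations.
object LeafEffectSketch {
  // Interprets one leaf operation at a time ("a way to transform from F into runtime operations").
  trait Handler[F[_]] { def apply[X](op: F[X]): X }

  // A program is either a finished value or a leaf operation plus a continuation.
  sealed trait Program[F[_], A] { def run(h: Handler[F]): A }

  final case class Done[F[_], A](value: A) extends Program[F, A] {
    def run(h: Handler[F]): A = value
  }

  final case class Leaf[F[_], X, A](op: F[X], next: X => Program[F, A]) extends Program[F, A] {
    // Not stack-safe; acceptable for a sketch.
    def run(h: Handler[F]): A = next(h(op)).run(h)
  }

  // Leaf operations available to test programs.
  sealed trait TestOp[X]
  final case class GetUrl(url: String) extends TestOp[String]

  // A program that fetches a page and returns its length.
  def pageLength(url: String): Program[TestOp, Int] =
    Leaf[TestOp, String, Int](GetUrl(url), html => Done[TestOp, Int](html.length))

  // Test handler: answers GetUrl from a map; unknown URLs yield an empty page.
  def inMemory(pages: Map[String, String]): Handler[TestOp] = new Handler[TestOp] {
    def apply[X](op: TestOp[X]): X = op match {
      case GetUrl(url) => pages.getOrElse(url, "")
    }
  }
}
```

Swapping the handler is what switches between test and production behaviour; the program value itself never changes.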

Overall, all things considered, and having seen the existence of BifunctorIO "in the wild", it seems like introducing a type class to describe IO and encouraging polymorphic code may be the most expected way of solving this problem.

@edmundnoble
Contributor

You don't need to use final tagless with type classes. Just pass around an HttpClient[F].

Furthermore I have still not seen any code that requires IO to be mocked for testing. @alexknvl and I have talked about one situation where this would be useful: to test that concurrency is working properly, we could interpret an IO into a statement in linear temporal logic, and test it against predicates. Predicates like: these two things should run concurrently and before this other thing, etc. async and sync not being modelled by literally executing their effect is irrelevant.

@pshirshov
Member

pshirshov commented Sep 28, 2018

@jdegoes aha, I see. I just wish to say that these are slightly different matters. Our DI does not conflict with your approach, and the two would work together very well (just as our DI works nicely with final tagless).

Would you like to chat about your needs? We would be happy to contribute our technology to scalaz. This would benefit us both. Scalaz folks would get best-in-the-world DI technology based on an interesting theory (in fact DIStage was just a first practical illustration of the theory behind it). We would get recognition - which is very important for us - we are a small company (very small one) struggling for adoption of our tech 😄

@jdegoes
Member Author

jdegoes commented Sep 28, 2018

You don't need to use final tagless with type classes. Just pass around an HttpClient[F].

Sure, that still requires a type class for IO though.

Furthermore I have still not seen any code that requires IO to be mocked for testing.

Whether you want to call it "mocked" or not is beside the point, though: the main point is that getUrl of HttpClient may not actually interact with the real world in a testing environment; which implies polymorphism; which implies the existence of something like Effect[F] / BifunctorIO[F].

@edmundnoble
Contributor

edmundnoble commented Sep 28, 2018

That is just a lot of non-sequiturs. You don't need a type class for IO to have a method polymorphic over F[_]. Polymorphism is taking in an HttpClient[F] for an abstract type F[_]. You haven't proven your point here, you've just written a lot. You also ignore the real case for mocking IO, which is to test concurrency. Make a point here, don't rely on me guessing.

Also, @pshirshov, I doubt that we need or would benefit from a DI library in Scalaz.

@jdegoes
Member Author

jdegoes commented Sep 28, 2018

@edmundnoble I don't care about "mocking IO", I care about mocking hunks of code that users pass into IO.sync / IO.async. I don't care if the underlying implementation of "mock IO" uses IO or not. That's irrelevant.

Please prove that you can write a polymorphic crawl that takes an HttpClient[F] without also requiring a type class (or a dictionary) providing IO-like methods (such as flatMap, map, and more importantly, redeem / attempt).

You can't do that. You can do nothing with HttpClient[F] except getUrl, which gives you back F. What are you going to do with that F? You can't do anything with it unless you have capabilities. A type class or dictionary gives you capabilities; since crawl requires sequencing, error handling, error recovery, etc., that means you need a type class over its "effectful" capabilities. If you don't believe this, I challenge you to sketch out crawl or even a simpler example.

@jdegoes
Member Author

jdegoes commented Sep 28, 2018

@pshirshov Yes, I agree they are quite compatible and I'd be happy to chat about ways we could work together. Appreciate your team's contributions to the library.

@pshirshov
Member

pshirshov commented Sep 28, 2018

@edmundnoble

Also, @pshirshov, I doubt that we need or would benefit from a DI library in Scalaz.

This is not your usual DI. You may call it a "hybrid module system", for example. If you check or try it - I'm sure you would agree that it's far beyond any other "classic" DI and provides unprecedented flexibility while keeping your code typesafe 😄

You also ignore the real case for mocking IO, which is to test concurrency.

In our case you may use a typeclass or... just use whatever else you want. It wouldn't break anything.

You haven't proven your point here, you've just written a lot.

I'm not sure what exactly I have to prove. My point is: a good DI mechanism would complement type-level approaches very well so I'm showing what we do with zio/cats in our projects and proposing to contribute some stuff. Nothing else.

@jdegoes happy to know that you are okay to chat. I'll send you a quick email if you don't mind.

@edmundnoble
Contributor

edmundnoble commented Sep 28, 2018

def crawl[F[_]: Monad](client: HttpClient[F], roots: Set[URL], ...): F[Unit] = {
  roots.toList.traverse_(client.getUrl)
}

This is all you've provided. I can adapt the signature exactly as in here. You haven't provided a body for crawl, so I don't know what capabilities you actually want. Furthermore, there are issues with having a type class over two-parameter type constructors: you can't use transformers at all anymore, and code must run in IO. At that point, the abstraction is useless, because there can be only one IO monad. Again, write some code that runs with your abstraction that can't be written otherwise.

@pshirshov I have looked at the DI framework you're showing. Dependency injection is never a tool, and always a framework. We pass arguments to functions here. Scalaz's core philosophy is fundamentally against DI. I don't know what @jdegoes is thinking.

@pshirshov
Member

pshirshov commented Sep 28, 2018

Dependency injection is never a tool, and always a framework.

In our case it's a non-invasive modular tool which may emit code without any need for runtime support (the static API is not published in full yet, though we are working on it). In case you are pointing at our @Id thing - it's a StaticAnnotation, and instead of using it you may tag your keys with types. Moreover, you may throw out our introspection strategy and so drop support for @Id.

Scalaz's core philosophy is fundamentally against DI.

Is it a dogma? For example, we brought a 50% performance boost (conservative estimate) to our customer by complementing cats and zio with our DI and bringing some other things. This stuff is already in production so these aren't just words. And we didn't lose a single bit of safety.

I guess the "DI" term is spoiled and I should say "module system"

@edmundnoble
Contributor

I really don't care about the runtime parts, could be compile-time only and it'd be the same. It's not a dogma, it's just what DI is; overcomplication. Passing arguments to functions should always work, with the sole exception of type classes (which have proven themselves). I'm kind of amazed that the code benefited performance-wise from DI. That to me indicates severe architecture problems.

@pshirshov
Member

pshirshov commented Sep 28, 2018

it's just what DI is; overcomplication

Not true. In case your apps are small - yes. In case your apps are huge - you would have a big dependency graph to manage and transform. What would be the cost of a typical refactoring in terms of O(n), where n is the number of nodes in your graph?

Computers can process graphs better than people, can't they?

Passing arguments to functions should always work, with the sole exception of type classes

What about 1M LoC?

which have proven themselves

I wouldn't agree with you, and if you think in terms of graphs and graph operations you would be able to prove to yourself that the cost of explicit manual parameter passing grows exponentially. It's possible to mitigate it, but in the end we are always limited by our brainpower. Computers compute better.

I'm kind of amazed that the code benefited performance-wise from DI.

Not the code.... The development cost and speed...

@edmundnoble
Contributor

Programs are graphs. Transforming a dependency graph == passing and transforming values. Dependency injection is an inner platform. If you can't manage all of your values, you should consider avenues other than automating the entire practice of programming.

@jdegoes
Member Author

jdegoes commented Sep 28, 2018

This is all you've provided. I can adapt the signature exactly as in here.

This is wrong already, because HttpClient requires a type constructor of F[_, _], and you've provided it a type constructor of F[_]. Moreover, let's assume crawl uses polymorphic errors and therefore requires a redeem method with a type signature that allows changing the error type.

You're stuck.

Furthermore, there are issues with having a type class over two-parameter type constructors: you can't use transformers at all anymore, and code must run in IO.

Nonsense. Code can run in EitherT[F, ?, ?] and infinitely many other data types.

At that point, the abstraction is useless, because there can be only one IO monad.

The number of data types that provide an instance is irrelevant (though will definitely be greater than just 1, due to orphan instances). The only thing that matters is the polymorphism of the client code, which can't exist — and still allow you to take advantage of IO-specific features, such as redeem, race, orElse, etc. — without a type class, dictionary, or equivalent mechanism.

@neko-kai
Member

neko-kai commented Sep 28, 2018

@jdegoes
Note that, while BifunctorIO class does exist in our codebase, it's not used for mocking – the API it exposes is too large and includes .sync, it's basically there just not to couple with ZIO too explicitly, the only instance is ZIO itself.

Instead, the dummy/production implementations of Services are injected for tests as in your HttpClient example.

For polymorphism, IMHO given working instances for cats-effect, quantified Monad/Applicative & type-changing Catch class it will be quite possible to write polymorphic code that nonetheless preserves ZIO's bifunctor properties:

import quantified._
import quantified.Quant._
import cats.effect._
import cats.implicits._
import scalaz.zio.IO

import scala.util.Random

def randomAdderService[F[+_, _]: ConcurrentThrowable: BifunctorCatch: Monad2: MonadTerminate2](initial: Int): F[Nothing, List[Int]] = {
  val adder: F[Nothing, Int] =
    syncTerminate[F, Int](Random.nextInt())
      .map(initial + _)

  BifunctorCatch[F].catchAll[Nothing, Throwable, List[Int]] {
    for {
      max <- SyncThrowable[F].delay(Random.nextInt(20))
      fibers <- Traverse[List].traverse(1.to(max).toList)(_ => ConcurrentThrowable[F].start(adder))
      res <- fibers.traverse(_.join)
    } yield res
  } { e => MonadTerminate2[F, Nothing].terminate(e) }
}

def syncTerminate[F[_, _]: SyncThrowable: BifunctorCatch: MonadTerminate2, A](thunk: => A): F[Nothing, A] =
  BifunctorCatch[F].catchAll[Nothing, Throwable, A](SyncThrowable[F].delay(thunk)) {
    exception => MonadTerminate2[F, Nothing].terminate(exception)
  }

type SyncThrowable[F[_, _]] = Sync[F[Throwable, ?]]
def SyncThrowable[F[_, _]: SyncThrowable]: SyncThrowable[F] = implicitly

type ConcurrentThrowable[F[_, _]] = Concurrent[F[Throwable, ?]]
def ConcurrentThrowable[F[_, _]: ConcurrentThrowable]: ConcurrentThrowable[F] = implicitly

//type MonadTerminate2[F[_, _]] = Param[Lambda[E => MonadTerminate[F[E, ?]]]]
type MonadTerminate2[F[_, _]] = Quant[MonadTerminate, F]
def MonadTerminate2[F[_, _]: MonadTerminate2, E]: MonadTerminate[F[E, ?]] = implicitly[MonadTerminate2[F]]

trait MonadTerminate[F[_]] {
  def terminate[A](t: Throwable): F[A]
}
object MonadTerminate {
  def apply[F[_]: MonadTerminate]: MonadTerminate[F] = implicitly

  implicit def monadTerminateIO[E]: MonadTerminate[IO[E, ?]] = new MonadTerminate[IO[E, ?]] {
    override def terminate[A](t: Throwable): IO[E, A] = IO.terminate(t)
  }
}

trait BifunctorCatch[F[_, _]] {
  def catchAll[E1, E, A](f: F[E, A])(h: E => F[E1, A]): F[E1, A]
}
object BifunctorCatch {
  def apply[F[_, _]: BifunctorCatch]: BifunctorCatch[F] = implicitly

  implicit val bifunctorCatchIO: BifunctorCatch[IO] = new BifunctorCatch[IO] {
    override def catchAll[E1, E, A](f: IO[E, A])(h: E => IO[E1, A]): IO[E1, A] = f.catchAll[E1, A](h)
  }
}

@pshirshov
Member

If you can't manage all of your values,

I can, though why should I, when I can write a tool, cut costs (in US dollars) by half and make the customer happy?

you should consider avenues other than automating the entire practice of programming.

If we are speaking in ultimate imperatives, you may wish to consider a platform other than github, hmm? Some people may be amazed if some guys speaking in ultimate terms would type their stuff directly into /dev/null.

@edmundnoble
Contributor

HttpClient requiring an F[_, _] is the bug.

Also, what are you talking about? What about StateT[F, S, IO[?, ?]]? That's not a type that will be inferred. As well, using EitherT[F, ?, ?] means not being able to access IO primitives that also deal with error types, because the error type will belong to EitherT, meaning you will have no instance.

"The number of data types that provide an instance is irrelevant" is STRICTLY incorrect. It couldn't be more incorrect. I don't even know how to explain this, because it seems completely obvious.

@pshirshov If you can cut costs by getting around programmer incompetence with tooling, go ahead. This is not what scalaz is for.

@jdegoes
Member Author

jdegoes commented Sep 28, 2018

Btw: to be clear I don't think DI belongs in Scalaz proper (e.g. scalaz-core or scalaz-zio) but I am personally interested in learning more about DIStage, and a solution in this space sufficiently principled could easily live in the broader Scalaz ecosystem (alongside Scalaz Schema, etc.). In any case, I love to foster collaboration between firms doing interesting work and open source projects I help out with.

@edmundnoble
Contributor

I'm aware that you love to foster cooperation. Not all cooperation produces better code. Some code deserves to be deleted or not included in the scalaz umbrella. This I believe is an example.

@neko-kai
Member

@edmundnoble
On "never a library, always a framework", note that opinionated tagless final machinery such as Haskell's mtl easily becomes as restrictive, or more so, than a runtime framework does. It's no different in that regard – passthrough instances for common transformers are defined ahead of time, out of control of the application developer.

It's also obvious that DI has to be a framework. It's acting as a second module system after the compiler is done wiring imports, there's no way to wire "runtime imports" without changing the application flow. The same is the case for implicits though – the moment an inductive implicit appears, programmer's control is reduced to magic rituals.

If you can't manage all of your values, you should consider avenues other than automating the entire practice of programming.

I take it you don't use scalaz-deriving, et al?

@jdegoes
Member Author

jdegoes commented Sep 28, 2018

HttpClient requiring an F[_, _] is the bug.

Nope. If HttpClient had 20 methods, some returning errors of one type, and some returning errors of another type, and some not returning errors at all, then I would want to capture those distinctions precisely.

That you want to pretend all operations can fail equally and with the same error type, and that you are willing to lose the ability to track error recovery statically, is a subjective personal preference that I don't have to live with.

Also, what are you talking about? What about StateT[F, S, IO[?, ?]]? That's not a type that will be inferred. As well, using EitherT[F, ?, ?] means not being able to access IO primitives that also deal with error types, because the error type will belong to EitherT, meaning you will have no instance.

You know my views on transformers. They shouldn't be used in Scala. Type classes should be used and always backed by IO (or equivalent, including potentially newtypes).

"The number of data types that provide an instance is irrelevant" is STRICTLY incorrect.

Nope again. The primary benefit of seeing def foo[F[_]: Monad] is not principally that foo may be called with infinitely many Monad; it's that the type signature tells you the capabilities of F that are required, which allows you to reason about its implementation. The secondary benefit is reuse, which may be exploited for testing.

Seeing crawl[F: HttpClient] (or equivalent) firstly lets you understand the set of effects required by crawl; and secondarily, lets you use a single implementation of crawl for production, and for testing, by either supplying, let's say, a newtype around IO for testing, whose instance does not actually perform real HTTP connections; and by supplying IO for production, whose instance actually does perform HTTP connections. (Or, since we are living in the world of Scala, you can write both instances for IO and choose them at call-site.)

Polymorphism promotes principled reasoning and also makes testing easier. Polymorphism around the full set of capabilities offered by IO requires a type class, dictionary, or module.

That you personally don't care about the full set of capabilities is irrelevant to me.

@pshirshov
Member

pshirshov commented Sep 28, 2018

@Milyardo

What does a hypothetical TestIO do?

It may apply some aspects to your code:

  1. Profile it: build flamegraphs, record traces.
  2. Intentionally introduce random delays (pseudo-random with fixed seeds, of course) in order to simplify stress-testing of timeout-dependent logic.
  3. Omit the application of some effects or change their semantics. Why not.

So, regarding the original question - I would start with a set of typeclasses describing the necessary primitives. It would be very convenient on its own - to use only a minimal set of zio features. Actually we kinda have it already by implementing cats instances but it isn't enough.
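The point about pseudo-random delays with fixed seeds comes down to reproducibility: the same seed must yield the same delay schedule on every run. A minimal sketch, where `delaysMillis` is a hypothetical helper and the schedule is only computed, not slept on:

```scala
// Hypothetical sketch: a reproducible delay schedule for stress tests.
object SeededDelaySketch {
  import scala.util.Random

  // Returns one pseudo-random delay (in milliseconds, 0 to maxMillis inclusive) per step.
  // The schedule depends only on the seed, so a failing run can be replayed exactly.
  def delaysMillis(seed: Long, steps: Int, maxMillis: Int): List[Int] = {
    val rng = new Random(seed)
    List.fill(steps)(rng.nextInt(maxMillis + 1))
  }
}
```

A test harness could log the seed on failure and feed it back in to reproduce the same interleaving of delays.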

@neko-kai
Member

neko-kai commented Sep 28, 2018

@emilypi

Are trivial derivations the point? You are proposing a stateful dictionary to load dependencies into your graph. Now reconsider the complexity you're introducing in contrast to that point.

A. State is not required for the general problem at all, the algorithm can run recursively and simply re-instantiate dependencies. There would be no difference for pure modules.
B. Even with sharing, explicit state is not required, e.g., for non-recursive graphs, module system can work by tying the knot into itself – http://ezyang.tumblr.com/post/146124297712/why-rmc-cant-be-defined-coinductively
C. Presence of state is not even relevant as it makes an algorithm neither wrong, nor overly complex – in this case we're talking about nothing but recursively looking up constructors – it's one of the most trivially definable folds.

In short, I would like to, please, be presented with the point™ of what you're trying to get at. State the problem, or the laws that are broken or the laws that should be created!

Coercion is a non-sequitur. Where are you getting that? Newtyping does not entail coercion - that is an unfortunate side effect of optimization that is only ever allowed in a provably lawful context. Please reconsider your knowledge of this subject.

It's just given as an example of 'magic' that derivations entail.

This is wearing thin. You neither understand the definition of ad-hoc, nor what a 'type' is, nor what a list of operations is, nor what a half-baked DI scheme would entail from a lawful standpoint. Please stop.

Blanket statement; please address the point! As of now, I'm presented with zero proof that you have any idea what you're talking about, otherwise you'd be able to provide constructive input. This goes the same for our reddit discussions. If you want to make a stance against tooling – whether IDEs or derivations, just state so upfront.

@jdegoes
Member Author

jdegoes commented Sep 28, 2018

@Milyardo @edmundnoble

Please test this code:

  def getURL(url: URL): IO[Exception, String] = ???

  def extractURLs(root: URL, html: String): List[URL] = ???

  final case class Crawl[E, A](error: E, value: A) {
    def leftMap[E2](f: E => E2): Crawl[E2, A] = Crawl(f(error), value)
    def map[A2](f: A => A2): Crawl[E, A2] = Crawl(error, f(value))
  }
  object Crawl {
    implicit def CrawlMonoid[E: Monoid, A: Monoid]: Monoid[Crawl[E, A]] =
      new Monoid[Crawl[E, A]]{
        def zero: Crawl[E, A] = Crawl(mzero[E], mzero[A])
        def append(l: Crawl[E, A], r: => Crawl[E, A]): Crawl[E, A] =
          Crawl(l.error |+| r.error, l.value |+| r.value)
      }
  }

  def crawlIO[E: Monoid, A: Monoid](
    seeds     : Set[URL],
    router    : URL => Set[URL],
    processor : (URL, String) => IO[E, A]): IO[Exception, Crawl[E, A]] = {
      def loop(seeds: Set[URL], visited: Ref[Set[URL]], crawl0: Ref[Crawl[E, A]]): IO[Exception, Crawl[E, A]] =
        // Base case: with no URLs left to visit, return the accumulated result.
        if (seeds.isEmpty) crawl0.get
        else
          (IO.parTraverse(seeds) { url =>
            for {
              html  <- getURL(url)
              crawl <- process1(url, html)
              // Keep only the routed links that have not been visited yet.
              links <- visited.get.map(extractURLs(url, html).toSet.flatMap(router) diff _)
            } yield (crawl, links)
          }).map(_.foldMap(identity)).flatMap {
            case (crawl1, links) =>
              visited.update(_ ++ seeds).flatMap(_ =>
                crawl0.update(_ |+| crawl1).flatMap(_ =>
                  loop(links, visited, crawl0)
                )
              )
          }

      def process1(url: URL, html: String): IO[Nothing, Crawl[E, A]] =
        processor(url, html).redeemPure(Crawl(_, mzero[A]), Crawl(mzero[E], _))

      for {
        set       <- Ref(Set.empty[URL])
        crawlRef  <- Ref(mzero[Crawl[E, A]])
        crawl     <- loop(seeds, set, crawlRef)
      } yield crawl
    }

Your unit test should not interact with the real world.

Hint: You can't do it, because it's impossible.

Not without refactoring the code to introduce indirection. IO.sync and IO.async permit arbitrary hunks of effectful code to be incorporated into IO, therefore any type signature that features IO can do anything at all; in this case it makes arbitrary network connections.

If you want to test crawl without interacting with the real world, then you need to refactor crawl and pass in the set of effects that it uses, either with dictionaries or type classes.

A polymorphic version of crawl looks like this:

  def crawl[F[_, _]: HttpClient: Effect, E: Monoid, A: Monoid](
    seeds     : Set[URL],
    router    : URL => Set[URL],
    processor : (URL, String) => F[E, A]): F[Exception, Crawl[E, A]] = {
      def loop(seeds: Set[URL], visited: Set[URL], crawl0: Crawl[E, A]): F[Exception, Crawl[E, A]] = ???

      loop(seeds, Set.empty, mzero[Crawl[E, A]])
    }

Note that it can work with any F that provides Effect (i.e. IO-like capability, but without sync or async). This means you can run crawl now with an F that does not interact with the real world, but still provides all the IO-like capabilities of Effect, as well as the HTTP-client like capabilities of HttpClient.

So in particular, you can call crawl with either a State-based implementation (say, case class TestIO[E, A](run: StateT[TestData, IO[E, ?], A])), or yet another IO-based implementation, but this one backed by test data, not real data. While all the operations of Effect will probably still run using IO (why not?), the methods of HttpClient will now be simulated using test data and will not actually interact with the real world.

So while your test will still use or "compile" to IO, it will not interact with the external world, and can therefore be run quickly and deterministically in your test suite.

The alternatives to type classes over IO are modularizing IO and passing dictionaries / sets of functions. Each has tradeoffs as noted above.
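
To make the type class option concrete, here is a minimal sketch with no library dependencies. Everything in it is an assumption made for illustration: Effect and HttpClient are cut down to the bare minimum (no concurrency, no Ref), URL is a String, and TestIO is a plain function over canned pages rather than the StateT-based version mentioned above. It is not the scalaz-zio API.

```scala
object PolymorphicCrawlSketch {
  type URL = String

  // Hypothetical minimal Effect: IO-like sequencing, but no sync/async escape hatch.
  trait Effect[F[_, _]] {
    def point[E, A](a: A): F[E, A]
    def flatMap[E, A, B](fa: F[E, A])(f: A => F[E, B]): F[E, B]
  }

  // Hypothetical HTTP capability, supplied separately from Effect.
  trait HttpClient[F[_, _]] {
    def getURL(url: URL): F[Exception, String]
  }

  // Sequential crawl that can only use what Effect and HttpClient provide.
  def crawl[F[_, _]](seeds: Set[URL], extract: String => Set[URL])(
      implicit F: Effect[F], H: HttpClient[F]): F[Exception, Set[URL]] = {
    def loop(todo: List[URL], visited: Set[URL]): F[Exception, Set[URL]] =
      todo match {
        case Nil                         => F.point(visited)
        case url :: rest if visited(url) => loop(rest, visited)
        case url :: rest =>
          F.flatMap(H.getURL(url)) { html =>
            loop(rest ++ extract(html).toList, visited + url)
          }
      }
    loop(seeds.toList, Set.empty)
  }

  // Test effect: a pure function from canned pages to a result.
  final case class TestIO[E, A](run: Map[URL, String] => Either[E, A])

  implicit val testEffect: Effect[TestIO] = new Effect[TestIO] {
    def point[E, A](a: A): TestIO[E, A] = TestIO(_ => Right(a))
    def flatMap[E, A, B](fa: TestIO[E, A])(f: A => TestIO[E, B]): TestIO[E, B] =
      TestIO(pages => fa.run(pages).flatMap(a => f(a).run(pages)))
  }

  implicit val testHttp: HttpClient[TestIO] = new HttpClient[TestIO] {
    def getURL(url: URL): TestIO[Exception, String] =
      TestIO(pages => pages.get(url).toRight(new Exception(s"not found: $url")))
  }

  // Each "page" is just a space-separated list of the URLs it links to.
  val pages: Map[URL, String] = Map("a" -> "b c", "b" -> "a", "c" -> "")
  def links(html: String): Set[URL] = html.split(" ").filter(_.nonEmpty).toSet

  val result: Either[Exception, Set[URL]] =
    crawl[TestIO](Set("a"), links).run(pages)
}
```

Because crawl only sees Effect and HttpClient, the same definition could be instantiated with a production instance that delegates to IO, while the test above never leaves pure code.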

@tonymorris

Could you prove that? Better in formal manner.

I resent even wasting the effort on DI to type this sentence.

I mentioned that I meant development cost reduction. In USD.

You are almost certainly not going to convince me of this, so feel free to ignore my curiosity when I ask to see your methods and means of measurement.

@pshirshov
Member

pshirshov commented Sep 28, 2018

I resent even wasting the effort on DI to type this sentence.

So it's horseshit because you said so, right? With all due respect, you aren't being constructive at all.

and means of measurement.

The method is simple: automate the most expensive rituals from a developer's daily life and decouple teams by introducing a powerful language to build APIs.

The measurement methodology is trivial: median feature/fix delivery time and some other KPIs.

@tonymorris

Yeah, so I am going to call shenanigans. Scalaz is not the project for magic woowoo DI nonsense. There are lots of gullible Scala programmers out there, but not here. Try elsewhere.

"Try a global variable and make your code buggy. It will save you moneeeeeey!" -- Scalaz.

@pshirshov
Member

pshirshov commented Sep 28, 2018

magic woowoo DI nonsense

There is no magic in these things. There isn't much difference between, for example, FMs internal state (or zio internal state) and our planner state. The only difference is the domain. And the presence of a garbage collector which we run when we finish building the "script" for our interpreter.

Try elsewhere.

Understood and considered.

"Try a global variable and make your code buggy. It will save you moneeeeeey!" -- Scalaz.

You've triggered on prohibited "DI" abbreviation not even trying to understand what are we talking about. There are no "global variables" nor anything people get used to find in DIs.

There is an effect-free planner which plans the job to be done, a garbage collector to throw out unnecessary operations and different interpreters allowing you to instantiate the context or write the tree (which would be very similar to one you write manually while wiring your application) or just print the plan.

I may give you some relief by telling you that you may consider the planner a free monad (and it's not an issue to add such an interface to it) and the main provisioner one of the interpreters. Which may run during compilation just in case. Does it sound better now?

@tonymorris

You've triggered on prohibited "DI" abbreviation not even trying to understand what are we talking about.

No I didn't. I was taking the piss. I have seen all the varieties of pretentiously passing function arguments, including the one you are proposing here.

@pshirshov
Member

including the one you are proposing here.

Could you please point me to such a thing? From what I know there is nothing like that at the moment. Neither for Scala nor for anything else.

@tonymorris

The idea of abusing implicit to do "dependency injection" comes up all the time. I believe it will even appear in the language or standard library one day. It's bogus though. This proposal is just another variation on that abusing of implicit.

Here's a thing. Never use implicit in a way that you cannot also write a type-class in Haskell. Doing otherwise is a bad idea, though the reasons it is a bad idea are completely unrelated to Haskell. This thing is a good idea.

@pshirshov
Member

Holy moly. How are implicits related to the thing I'm talking about?...

facepalm.jpg

@tonymorris

Because you linked it.

link.jpg.png.pdf

@tonymorris

@pshirshov
Member

Remember. You would be surprised but it's not about DI over implicits. At all.

@tonymorris

Cool. Show me the code.

@pshirshov
Member

Fine. It's in the repo.

@tonymorris

the

@tonymorris

Which repo? Which code? Please show me the code.

@emilypi

emilypi commented Sep 28, 2018

@tonymorris you have incredible patience.

@Milyardo

Milyardo commented Sep 29, 2018

Your unit test should not interact with the real world.

At the bare minimum, why could you not parameterize getURL? In this example this seems to be where you interact with the real (or at least outside) world.

https://scastie.scala-lang.org/hOVpVTalSBemLwNBJfM5Zg
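
For readers who don't follow the link, parameterizing getURL could look roughly like the sketch below. It is not the linked Scastie code: IO here is a tiny hypothetical stand-in fixed to Exception errors (not scalaz-zio's IO), URL is a String, and fetchAll is a made-up, much simpler function than crawlIO.

```scala
object ParameterizedGetUrlSketch {
  type URL = String

  // Hypothetical stand-in for IO[Exception, A]: a deferred computation that may fail.
  final case class IO[A](unsafeRun: () => Either[Exception, A]) {
    def flatMap[B](f: A => IO[B]): IO[B] =
      IO(() => unsafeRun().flatMap(a => f(a).unsafeRun()))
    def map[B](f: A => B): IO[B] =
      IO(() => unsafeRun().map(f))
  }

  // The effectful dependency is an ordinary argument instead of a top-level function.
  def fetchAll(getURL: URL => IO[String])(urls: List[URL]): IO[List[String]] =
    urls.foldRight(IO[List[String]](() => Right(Nil))) { (url, acc) =>
      getURL(url).flatMap(html => acc.map(html :: _))
    }

  // Test double backed by canned pages; nothing touches the network.
  val pages: Map[URL, String] = Map("a" -> "<a>", "b" -> "<b>")
  def stubGetURL(url: URL): IO[String] =
    IO(() => pages.get(url).toRight(new Exception(s"not found: $url")))

  val ok: Either[Exception, List[String]] =
    fetchAll(stubGetURL)(List("a", "b")).unsafeRun()
  val missing: Either[Exception, List[String]] =
    fetchAll(stubGetURL)(List("a", "zzz")).unsafeRun()
}
```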

In the polymorphic case, this function creates Refs, if you were to create a test IO effect, what would a TestIO do differently for these Refs?

@Milyardo

Let's say we were to use Effect type classes: what's the benefit of implementing our own versus implementing the cats one for interop? What would the differentiator between the two be?

@neko-kai
Member

neko-kai commented Sep 29, 2018

@tonymorris

Here's a thing. Never use implicit in a way that you cannot also write a type-class in Haskell.

You can easily write the filthiest possible DI as a Haskell type-class, your point?

{-# LANGUAGE UndecidableInstances, FlexibleInstances, AutoDeriveTypeable #-}
import Control.Monad.Reader.Class
import Control.Monad
import Data.Maybe
import Data.TMap as T
import Data.Typeable

class MonadHttp m where
  getPage :: String -> m Html

instance (Typeable m, MonadReader TMap m) => MonadHttp m where
  getPage s = ($ s) =<< asks (getPage' . fromJust . T.lookup)

data HttpClient m = HttpClient {
  getPage' :: String -> m Html
}

Made with ❤️ stack ❤️, btw.

The idea of abusing implicit to do "dependency injection" comes up all the time.

distage doesn't use implicits, except for materialising type tags.

Which repo? Which code? Please show me the code.

There's a GitHub link on the doc site. I trust you'll find it yourself, eventually.

@hobwekiva

@jdegoes

Testing IO programs should be easy, which means that it should be possible to make programs that are polymorphic in the effect type, so that either IO may be used, or a hypothetical TestIO.

This might be of interest - https://hackage.haskell.org/package/IOSpec. I am very curious what is possible given a proper final tagless abstraction over IO's concurrency capabilities 👍

Overall, all things considered, and having seen the existence of BifunctorIO "in the wild", it seems like introducing a type class to describe IO and encouraging polymorphic code may be the most expected way of solving this problem.

Other options are conceptually very similar, but I think final tagless would be the most idiomatic and least error-prone.

@Kaishh

quantified implicits

❤️ I think the code can be written a bit clearer using macros (I have something similar in leibniz 1, 2). Could be extremely useful if it works consistently well. I actually would love to have something similar in scalaz proper.

@edmundnoble

Furthermore, there are issues with having a type class over two-parameter type constructors: you can't use transformers at all anymore, and code must run in IO. At that point, the abstraction is useless, because there can be only one IO monad.

Both statements seem correct but miss the point and don't provide an explanation. IIUC you can't use transformers because you would need "quantified implicits" for MTL (or QuantifiedConstraints).

There can be only one IO monad.

That doesn't mean that there is only one possible implementation for some Foo[F[_]] for a sufficiently restricted Foo and that's exactly what @jdegoes is arguing for. Remove sync and async and you can run everything in State modulo concurrency, for example. See https://hackage.haskell.org/package/IOSpec.
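
As a rough illustration of "run everything in State modulo concurrency", here is a sketch of what Ref-like operations might look like in a purely state-threaded test effect. All names (Heap, TestIO, newRef, get, update) are hypothetical, the error channel is omitted, and this is neither how IOSpec nor how scalaz-zio implements references.

```scala
object StateRefSketch {
  // Heap of reference cells, threaded purely through the computation.
  final case class Heap(next: Int, cells: Map[Int, Any])

  // A reference is just a typed key into the heap.
  final case class Ref[A](id: Int)

  // Test effect: a state transition over the heap; no real mutation anywhere.
  final case class TestIO[A](run: Heap => (Heap, A)) {
    def flatMap[B](f: A => TestIO[B]): TestIO[B] =
      TestIO { h => val (h1, a) = run(h); f(a).run(h1) }
    def map[B](f: A => B): TestIO[B] =
      flatMap(a => TestIO(h => (h, f(a))))
  }

  def newRef[A](a: A): TestIO[Ref[A]] =
    TestIO(h => (Heap(h.next + 1, h.cells.updated(h.next, a)), Ref[A](h.next)))

  // The cast is safe as long as cells are only written through a Ref[A] with the same id.
  def get[A](ref: Ref[A]): TestIO[A] =
    TestIO(h => (h, h.cells(ref.id).asInstanceOf[A]))

  def update[A](ref: Ref[A])(f: A => A): TestIO[A] =
    TestIO { h =>
      val a = f(h.cells(ref.id).asInstanceOf[A])
      (h.copy(cells = h.cells.updated(ref.id, a)), a)
    }

  val program: TestIO[Set[String]] =
    for {
      visited <- newRef(Set.empty[String])
      _       <- update(visited)(_ + "a")
      _       <- update(visited)(_ + "b")
      seen    <- get(visited)
    } yield seen

  val result: Set[String] = program.run(Heap(0, Map.empty))._2
}
```

In this sketch the only difference a test effect makes for Refs is where the cells live: in an immutable map carried by the interpreter instead of in mutable memory.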

@Milyardo

At the bare minimum, why could you not parameterize getURL? In this example this seems to be where you interact with the real (or at least outside) world.

What if someone uses sync somewhere by mistake? You need to track down all such places and replace them with parameters. Polymorphism guarantees you that there are no such places.

@Kaishh @pshirshov

https://blog.softwaremill.com/what-is-dependency-injection-8c9e7805502f makes a lot of sense.

https://izumi.7mind.io/v0.5.50-SNAPSHOT/doc/distage/ needs an explanation of what all that magic transforms into, e.g. what make does, what many does, etc. You might want to adapt the language (and terminology) for FP people to understand it; it is currently a bit vague IMHO.

You can easily write the filthiest possible DI as a Haskell type-class, your point?

You could also implement the same thing using reflection. It's not clear to me how significant the advantage over passing dependencies explicitly (as a dictionary) is, although I recognize the O(n^2) problems associated with explicit dependency management.

If I understand correctly, DI amounts to: Given a dependency graph that can be represented as a collection of functions (C1, ... Cn) => Cx (plus maybe some constraints like C1 is initialized before C3), and some external constraints write a program that produces all components C_i. If this could be done at compile-time, it would be a pretty useful tool. Passing those components further down the line could be done with implicits (think reflection) or explicitly.
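
To pin down what I mean by that description, here is a deliberately naive runtime sketch: components are named constructor functions, a plan is a dependency-first ordering of whatever the root needs, and instantiation folds over the plan. The names (Component, plan, instantiate) are made up, the graph is assumed to be acyclic, everything is untyped, and this is not a description of how distage or any other library actually works.

```scala
object WiringSketch {
  // A component: its name, the names it depends on, and a constructor over the resolved deps.
  final case class Component(name: String, deps: List[String], build: List[Any] => Any)

  // Plan: dependency-first order of the components reachable from `root`.
  // Components the root does not need never appear. Assumes an acyclic graph.
  def plan(graph: Map[String, Component], root: String): List[String] = {
    def visit(name: String, done: List[String]): List[String] =
      if (done.contains(name)) done
      else graph(name).deps.foldLeft(done)((acc, d) => visit(d, acc)) :+ name
    visit(root, Nil)
  }

  // Interpreter: build each component in plan order from already-built dependencies.
  def instantiate(graph: Map[String, Component], order: List[String]): Map[String, Any] =
    order.foldLeft(Map.empty[String, Any]) { (built, name) =>
      val c = graph(name)
      built.updated(name, c.build(c.deps.map(built)))
    }

  val graph: Map[String, Component] = List(
    Component("config", Nil, _ => "cfg"),
    Component("http", List("config"), deps => s"http(${deps.head})"),
    Component("crawler", List("http", "config"), deps => s"crawler(${deps.mkString(",")})"),
    Component("unused", Nil, _ => "never built")
  ).map(c => c.name -> c).toMap

  val order: List[String]     = plan(graph, "crawler")
  val built: Map[String, Any] = instantiate(graph, order)
}
```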

However, you lose a clear understanding of what your components actually do and replace explicit dependency passing with some opaque algorithm. This would be my main concern if I was using such a system, especially for testing, because I need a very clear understanding what my code does when I am testing something.

Though fine. In case I'm wrong here - I'm very sorry and wish everyone who was offended all the best.

I think that everyone is being a bit too abrasive for no good reason.

@Milyardo

Milyardo commented Oct 2, 2018

@alexknvl

What if someone uses sync somewhere by mistake? You need to track down all such places and replace them with parameters. Polymorphism guarantees you that there are no such places.

Isn't that why we're writing tests instead of relying on the compiler to find bugs for us?

@hobwekiva

@Milyardo

Isn't that why we're writing tests instead of relying on the compiler to find bugs for us?

How would you test that nothing in your code uses sync despite being given access to it, except in some blessed parts of the code like MVar?

@jdegoes
Member Author

jdegoes commented Oct 3, 2018

What if someone uses sync somewhere by mistake? You need to track down all such places and replace them with parameters. Polymorphism guarantees you that there are no such places.

This is the most compelling argument for Effect[F[_, _]]. It's more machinery than stuffing your IO functions into a data structure, but you have stronger guarantees.
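
A small sketch of the difference, under loud assumptions: IO below is a toy stand-in with a sync constructor (not the real scalaz-zio type), the error parameter is dropped so F has a single type parameter, and Http, leakyFetch and fetchPoly are hypothetical names.

```scala
object EscapeHatchSketch {
  type URL = String

  // Toy stand-in for a concrete IO type that, like the real one, can wrap any side effect.
  final class IO[A](val unsafeRun: () => A) {
    def flatMap[B](f: A => IO[B]): IO[B] =
      new IO[B](() => f(unsafeRun()).unsafeRun())
  }
  object IO {
    def sync[A](a: => A): IO[A] = new IO[A](() => a)
  }

  // Stands in for a network call, file write, etc.
  var realWorldCalls = 0

  // Data-structure style: getURL is stubbed, yet the body can still reach for IO.sync,
  // and the signature gives no hint that it does.
  def leakyFetch(getURL: URL => IO[String])(url: URL): IO[String] =
    IO.sync { realWorldCalls += 1 }.flatMap(_ => getURL(url))

  // Type-class style: with an abstract F, only the operations of Http[F] are available.
  trait Http[F[_]] {
    def getURL(url: URL): F[String]
  }
  def fetchPoly[F[_]](url: URL)(implicit H: Http[F]): F[String] =
    H.getURL(url)

  // Test instances.
  val stub: URL => IO[String] = url => IO.sync(s"<html>$url</html>")

  type Id[A] = A
  implicit val idHttp: Http[Id] = new Http[Id] {
    def getURL(url: URL): Id[String] = s"<html>$url</html>"
  }

  val leaked: String = leakyFetch(stub)("a").unsafeRun() // bumps realWorldCalls despite the stub
  val pure: String   = fetchPoly[Id]("a")                // cannot touch realWorldCalls through F
}
```

Nothing stops fetchPoly from mutating a var directly, of course; the guarantee is only about what can be done through F.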

@Milyardo

Milyardo commented Oct 3, 2018

I had assumed a mythical effect type class would still have sync, and if it doesn't, how is it not just a plain bimonad?
