In [1]:
import $file.common
import common._
import cats._, cats.implicits._, cats.data._
import fs2.Stream
import doobie._, doobie.implicits._

# Variation 4. MTL-based Repositories

The DAO approach is doomed. It forces us to commit to a particular computation type: `Id[_]` (i.e. synchronous); `cats.effect.IO[_]`, `scala.concurrent.Future[_]`, etc. (i.e. asynchronous); `cats.data.StateT[F, S, _]` (if we want to do unit testing without mocking); and so on. Each of these computation types demands its own DAO API, which would, in turn, demand its own business logic implementation. Clearly, this is no good. What if we had a truly generic DAO API that could be adapted to any computation type? This is what the MTL style offers us: the possibility of programming DAO APIs which are parameterised by any computation type we like.

In [2]:
// Case classes, as before

case class Country(code: String, name: String, capital: Option[Int])
case class City(id: Int, name: String, countryCode: String, population: Int)

// DAO APIs as type constructor classes

trait CityRepo[F[_]]{
    def city(id: Int): F[City]
    def cityName(id: Int): F[String]
    def cityPopulation(id: Int): F[Int]
    def cityCountryCode(id: Int): F[String]
}

trait CountryRepo[F[_]]{
    def country(code: String): F[Country]
    def countryName(code: String): F[String]
    def countryCapital(code: String): F[Option[Int]]
}

trait WorldRepo[F[_]] extends CityRepo[F] with CountryRepo[F]{
    def allCountries: F[Country]
    def allCountryCodes: F[String]
    def allCityIds: F[Int]
}

defined class Country
defined class City
defined trait CityRepo
defined trait CountryRepo
defined trait WorldRepo

Here, the type constructor parameter `F[_]` represents a generic computation type, and these APIs represent classes of computations, namely those computation types which allow us to access world data. Queries are then programmed in much the same way as before, except that we need extra APIs to compose the instructions of the domain repository models: `Monad[_[_]]` and `FunctorFilter[_[_]]`. These are computation classes as well: the class of imperative computations and the class of computations that can be filtered, respectively. This is all we need in order to write our `largeCapitals` query, once and for all, in the same readable way as before:

In [3]:
def largeCapitals[F[_]: Monad: FunctorFilter](implicit W: WorldRepo[F]): F[(String, String)] = for {
    Country(_, name, Some(capital)) <- W.allCountries
    city <- W.city(capital)
    if city.population > 8000000
} yield (city.name, name)

defined function largeCapitals
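
To see what the two constraints buy us, here is roughly what the for-comprehension boils down to once the pattern match and the guard are expressed through filtering. This is a minimal sketch, not the cats API: the `Monad` and `FunctorFilter` traits below are simplified stand-ins, and `MiniWorldRepo`, the `List` instances and the sample data are hypothetical names introduced for illustration only.

```scala
object DesugarSketch {
  // Simplified stand-ins for cats.Monad and cats.FunctorFilter (assumptions, not the real API).
  trait Monad[F[_]] {
    def pure[A](a: A): F[A]
    def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
    def map[A, B](fa: F[A])(f: A => B): F[B] = flatMap(fa)(a => pure(f(a)))
  }
  trait FunctorFilter[F[_]] {
    def mapFilter[A, B](fa: F[A])(f: A => Option[B]): F[B]
  }

  case class Country(code: String, name: String, capital: Option[Int])
  case class City(id: Int, name: String, countryCode: String, population: Int)

  // Hypothetical cut-down repository: only the two instructions the query uses.
  trait MiniWorldRepo[F[_]] {
    def allCountries: F[Country]
    def city(id: Int): F[City]
  }

  def largeCapitals[F[_]](implicit M: Monad[F], FF: FunctorFilter[F],
                          W: MiniWorldRepo[F]): F[(String, String)] = {
    // Pattern-matching generator: keep only countries that have a capital.
    val withCapital: F[(String, Int)] =
      FF.mapFilter[Country, (String, Int)](W.allCountries)(c => c.capital.map(id => (c.name, id)))
    M.flatMap[(String, Int), (String, String)](withCapital) { case (name, capital) =>
      // Guard: keep only cities above the population threshold.
      val large: F[City] =
        FF.mapFilter[City, City](W.city(capital))(c => Some(c).filter(_.population > 8000000))
      M.map(large)(city => (city.name, name))
    }
  }

  // List instances, just enough to run the query in memory.
  implicit val listMonad: Monad[List] = new Monad[List] {
    def pure[A](a: A): List[A] = List(a)
    def flatMap[A, B](fa: List[A])(f: A => List[B]): List[B] = fa.flatMap(f)
  }
  implicit val listFilter: FunctorFilter[List] = new FunctorFilter[List] {
    def mapFilter[A, B](fa: List[A])(f: A => Option[B]): List[B] = fa.flatMap(a => f(a).toList)
  }
  implicit val listRepo: MiniWorldRepo[List] = new MiniWorldRepo[List] {
    private val cities = Map(
      0 -> City(0, "Madrid", "ES", 9000000),
      2 -> City(2, "London", "UK", 500000))
    def allCountries: List[Country] = List(
      Country("ES", "Spain", Some(0)),
      Country("UK", "United Kingdom", Some(2)),
      Country("UNK", "Unknown", None))
    def city(id: Int): List[City] = cities.get(id).toList
  }

  val result: List[(String, String)] = largeCapitals[List]
}
```

Nothing in `largeCapitals` mentions `List`: the computation type is only fixed at the call site, where the implicit instances are picked up.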

Can we really run this business logic for any kind of computation? Let's try it.

### Stream-based Doobie implementation

We will first try to run our query against the world database. In order to do so, we choose to translate `WorldRepo` instructions into `Stream[ConnectionIO, T]` computations, i.e. computations that eventually (when run) return a stream of values of type `T` obtained through JDBC `ConnectionIO` programs. This is the instance:

In [4]:
type DoobieStr[T] = Stream[ConnectionIO, T]

implicit object DoobieStrWorldRepo extends WorldRepo[DoobieStr]{

    def city(id: Int): Stream[ConnectionIO, City] = 
        sql"select id, name, countryCode, population from city where id = $id"
            .query[City].stream
    
    def cityName(id: Int): Stream[ConnectionIO, String] = 
        sql"select name from city where id = $id"
            .query[String].stream
    
    def cityPopulation(id: Int): Stream[ConnectionIO, Int] = 
        sql"select population from city where id = $id"
            .query[Int].stream

    def cityCountryCode(id: Int): Stream[ConnectionIO, String] = 
        sql"select countryCode from city where id = $id"
            .query[String].stream

    def country(code: String): Stream[ConnectionIO, Country] = 
        sql"select code, name, capital from country where code = $code"
            .query[Country].stream
    
    def countryName(code: String): Stream[ConnectionIO, String] =
        sql"select name from country where code = $code"
            .query[String].stream
    
    def countryCapital(code: String): Stream[ConnectionIO, Option[Int]] =
        sql"select capital from country where code = $code"
            .query[Option[Int]].stream
    
    def allCountries: Stream[ConnectionIO, Country] = 
        sql"select code, name, capital from country"
            .query[Country].stream
    
    def allCountryCodes: Stream[ConnectionIO, String] = 
        sql"select code from country"
            .query[String].stream
    
    def allCityIds: Stream[ConnectionIO, Int] = 
        sql"select id from city"
            .query[Int].stream
}

defined type DoobieStr
defined object DoobieStrWorldRepo

In order to run the query, we just need to specify the desired computation type (all the required dependencies will be injected automatically through the implicit mechanism); then, we compile the stream and the JDBC program, and, last, interpret the resulting IO program:

In [5]:
largeCapitals[DoobieStr] // Stream[ConnectionIO, (String, String)]
    .compile.toList      // ConnectionIO[List[(String, String)]]
    .transact(xa)        // IO[List[(String, String)]]
    .unsafeRunSync       // List[(String, String)]

res4: List[(String, String)] = List(
  ("Jakarta", "Indonesia"),
  ("Seoul", "South Korea"),
  ("Ciudad de M\u00e9xico", "Mexico"),
  ("Moscow", "Russian Federation")
)

It works! Can we also do unit testing?

### Unit testing with `StateT`

Unit testing can be done in a purely functional way, i.e. without mocking libraries, using a particular type of computation: state transformers. The basic idea is to interpret domain instructions in terms of transformations or queries over the `World` state (which is represented as an in-memory data type). In our simplified case, we don't have transformations, so a computation `World => List[T]` suffices. 

In [6]:
case class World(
    countries: Map[String, Country],
    cities: Map[Int, City])

object World{
    
    type State[T] = StateT[List, World, T]

    implicit object StateTWorldRepo extends WorldRepo[State]{

        // Cities
        
        def city(id: Int): State[City] = 
            StateT.inspectF(_.cities.get(id).toList)

        def cityName(id: Int): State[String] = 
            city(id).map(_.name)

        def cityCountryCode(id: Int): State[String] = 
            city(id).map(_.countryCode)

        def cityPopulation(id: Int): State[Int] = 
            city(id).map(_.population)

        //  Countries
        
        def country(code: String): State[Country] =
            StateT.inspectF(_.countries.get(code).toList)

        def countryName(code: String): State[String] =
            country(code).map(_.name)

        def countryCapital(code: String): State[Option[Int]] =
            country(code).map(_.capital)
        
        // World
        
        def allCityIds: State[Int] = 
            StateT.inspectF(_.cities.keys.toList)

        def allCountryCodes: State[String] = 
            StateT.inspectF(_.countries.keys.toList)

        def allCountries: State[Country] = 
            StateT.inspectF(_.countries.values.toList)
    }
}

defined class World
defined object World
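
Since none of our instructions modifies the world, the state-threading part of `StateT` is never exercised. As a minimal sketch of the simpler `World => List[T]` alternative mentioned above, the following block writes the sequencing and filtering operations by hand instead of relying on cats instances. The names `ReaderSketch`, `Query`, `flatMap` and `mapFilter` are ours, introduced for illustration only.

```scala
object ReaderSketch {
  case class Country(code: String, name: String, capital: Option[Int])
  case class City(id: Int, name: String, countryCode: String, population: Int)
  case class World(countries: Map[String, Country], cities: Map[Int, City])

  // A read-only computation: given a world, produce zero or more results.
  type Query[T] = World => List[T]

  // Sequencing: run `q`, then run the follow-up query for each result on the same world.
  def flatMap[A, B](q: Query[A])(f: A => Query[B]): Query[B] =
    world => q(world).flatMap(a => f(a)(world))

  // Filtering and mapping in one step: results mapped to None are dropped.
  def mapFilter[A, B](q: Query[A])(f: A => Option[B]): Query[B] =
    world => q(world).flatMap(a => f(a).toList)

  val allCountries: Query[Country] = world => world.countries.values.toList
  def city(id: Int): Query[City] = world => world.cities.get(id).toList

  val largeCapitals: Query[(String, String)] =
    flatMap[Country, (String, String)](allCountries) { country =>
      // Countries without a capital contribute no results.
      val capital: Query[City] =
        world => country.capital.toList.flatMap(id => city(id)(world))
      mapFilter[City, (String, String)](capital) { c =>
        if (c.population > 8000000) Some((c.name, country.name)) else None
      }
    }

  val smallWorld: World = World(
    Map("ES" -> Country("ES", "Spain", Some(0)),
        "USA" -> Country("USA", "United States", Some(1)),
        "UK" -> Country("UK", "United Kingdom", Some(2)),
        "UNK" -> Country("UNK", "Unknown", None)),
    Map(0 -> City(0, "Madrid", "ES", 9000000),
        1 -> City(1, "Washington", "USA", 10000000),
        2 -> City(2, "London", "UK", 500000)))
}
```

The price of this simplicity is that the query has to be rewritten against these ad hoc combinators, whereas the `StateT` instance lets us reuse the generic `largeCapitals` unchanged.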

The very same ScalaTest specification as before will serve:

In [7]:
import org.scalatest._

class LargeCapitalsSpec(largeCapitals: World => List[(String, String)])
extends FlatSpec with Matchers{
    
    val smallWorld: World =         
        World(Map("ES" -> Country("ES","Spain",Some(0)),
                "USA" -> Country("USA", "United States", Some(1)),
                "UK" -> Country("UK", "United Kingdom", Some(2)),
                "UNK" -> Country("UNK", "Unknown", None)),
        Map(0->City(0,"Madrid","ES",9000000),
            1->City(1,"Washington", "USA", 10000000),
            2->City(2,"London", "UK", 500000)))    
    
    "large capitals" should "be right" in {
        largeCapitals(smallWorld).toSet shouldBe 
            Set(("Madrid", "Spain"), ("Washington", "United States"))
    }
}

defined class LargeCapitalsSpec

Now, in order to unit test our query, we just instantiate it at the required type and discard the resulting state using `runA`:

In [8]:
run(new LargeCapitalsSpec(largeCapitals[World.State].runA))

cmd6$Helper$LargeCapitalsSpec:
large capitals
- should be right


### However ... 

We were so obsessed with modularity that we didn't pay attention to the performance of our interpreters. This is not so important for the unit testing interpreter, but for the doobie one it really matters. Let's obtain some figures:

In [9]:
val mtlTime: Long = 
    largeCapitals[DoobieStr] // Stream[ConnectionIO, (String, String)]
        .compile.toList      // ConnectionIO[List[(String, String)]]
        .transact(xa)        // IO[List[(String, String)]]
        .unsafeRunSync       // List[(String, String)]
        .timed(50)
        ._2

val sqlTime: Long = 
    sql"""
        | select C.name, X.name 
        | from city as C, country as X 
        | where C.id = X.capital and C.population > 8000000""".stripMargin
        .query[(String, String)]
        .to[List]
        .transact(xa)
        .unsafeRunSync
        .timed(50)
        ._2

println(s"ratio: ${mtlTime/sqlTime}")

ratio: 11

mtlTime: Long = 159290784L
sqlTime: Long = 13874006L
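
The `timed` helper used above comes from `common`, whose definition is not shown in this notebook. As an assumption about what such a helper might look like, here is a hypothetical stand-alone version that evaluates a by-name block `n` times and reports the average elapsed time in nanoseconds. The real helper may well differ (for instance, it may report the total rather than the average).

```scala
object TimingSketch {
  // Hypothetical helper: evaluates `thunk` n times and returns the last result
  // together with the average elapsed time per run, in nanoseconds.
  def timed[A](n: Int)(thunk: => A): (A, Long) = {
    require(n > 0, "n must be positive")
    val start = System.nanoTime()
    var last = thunk          // first evaluation
    var i = 1
    while (i < n) {           // remaining n - 1 evaluations
      last = thunk
      i += 1
    }
    val elapsed = System.nanoTime() - start
    (last, elapsed / n)
  }
}
```

Because `thunk` is passed by name, the block is re-evaluated on every iteration; with a by-value parameter it would only run once and the measurement would be meaningless.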

Around ten times slower than the plain SQL query (a ratio of 11 in this run) ... Something's going on here. Indeed, the postgres log tells us that we are suffering from the so-called _avalanche query_ problem (also known as the N+1 query problem). Each world model instruction from the repositories is compiled into an independent SQL query, and the interpreter does nothing to reassemble those queries into the optimal one. Worse still, the more data we have (the more countries, in this case), the more inefficient our query will be.
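
The effect can be reproduced without a database. The following minimal sketch replaces the two repository instructions used by `largeCapitals` with in-memory functions that bump a counter each time they are called, standing in for one SQL round trip each. The names `AvalancheSketch` and `queriesIssued`, as well as the sample data, are hypothetical.

```scala
object AvalancheSketch {
  case class Country(code: String, name: String, capital: Option[Int])
  case class City(id: Int, name: String, countryCode: String, population: Int)

  val countries = List(
    Country("ES", "Spain", Some(0)),
    Country("USA", "United States", Some(1)),
    Country("UK", "United Kingdom", Some(2)),
    Country("UNK", "Unknown", None))

  val cities = Map(
    0 -> City(0, "Madrid", "ES", 9000000),
    1 -> City(1, "Washington", "USA", 10000000),
    2 -> City(2, "London", "UK", 500000))

  // Counts how many times the pretend database is hit.
  var queriesIssued = 0

  def allCountries: List[Country] = { queriesIssued += 1; countries }
  def city(id: Int): List[City] = { queriesIssued += 1; cities.get(id).toList }

  // Same shape as largeCapitals: one query for the countries,
  // then one more query per country that has a capital.
  val result: List[(String, String)] = for {
    country <- allCountries
    capital <- country.capital.toList
    c       <- city(capital)
    if c.population > 8000000
  } yield (c.name, country.name)
}
```

With three countries that have a capital, the counter ends at four: one query for the countries plus one per capital. The single hand-written SQL join does the same work in one round trip, regardless of how many countries there are.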

Are we really lost here? Can't we somehow implement a smart interpreter that generates the optimal query? Yes, we can: enter the field of [Quoted Domain Specific Languages](Variation5.Quill.ipynb)!