Redesign #155

Merged
merged 68 commits into from Sep 18, 2018
Changes from 46 commits
3975fc5
:baby: steps
purrgrammer Aug 23, 2018
9d2f20f
:fire:
purrgrammer Aug 23, 2018
b1f5221
:fire: env in error
purrgrammer Aug 23, 2018
e02c9ac
:fire: FetchMonadError, interpret Fetch to IO
purrgrammer Aug 23, 2018
a21c1c7
:fire:
purrgrammer Aug 24, 2018
4adb52c
:fire: :fire: :fire:
purrgrammer Aug 24, 2018
543f875
High-level API working, now onto the execution model
purrgrammer Aug 27, 2018
658574e
:baby: run Fetch with an environment
purrgrammer Aug 27, 2018
c9bbba6
Use Deferred and IO internally for storing intermediate results
purrgrammer Aug 30, 2018
3eb78ff
:fire:
purrgrammer Aug 30, 2018
2098a62
Failing test to implement batching
purrgrammer Aug 30, 2018
1d3fa96
Time requests
purrgrammer Aug 30, 2018
d8ff3b8
Test execution model
purrgrammer Aug 31, 2018
713a95d
:fire:
purrgrammer Aug 31, 2018
09aa43a
:fire: monix subproject
purrgrammer Aug 31, 2018
5de8412
Async queries
purrgrammer Aug 31, 2018
28a98e4
Caching
purrgrammer Sep 2, 2018
c89a8c7
Tweaks to caching
purrgrammer Sep 2, 2018
4a5f42f
Error handling
purrgrammer Sep 2, 2018
82a6d24
Test batching
purrgrammer Sep 3, 2018
e5a3b92
wip environment debug code
purrgrammer Sep 3, 2018
12719b9
Use IO in DataSource
purrgrammer Sep 3, 2018
1c4a479
:fire: cleanup
purrgrammer Sep 3, 2018
edffae2
wip docs
purrgrammer Sep 3, 2018
346358f
Introduce parallelism
purrgrammer Sep 3, 2018
3223dcc
Progress on documentation
purrgrammer Sep 3, 2018
917fcb1
Update cache after fetching a batch
purrgrammer Sep 3, 2018
62a3f0d
Run both sides of a fetch with .tupled
purrgrammer Sep 3, 2018
04dcfdf
Use value classes for the in-memory cache
purrgrammer Sep 3, 2018
9f77120
Require DataSource in Fetch#apply
purrgrammer Sep 4, 2018
f73360a
Update Fetch#apply calls in docs
purrgrammer Sep 4, 2018
7570eb5
Add note to self
purrgrammer Sep 4, 2018
372fad4
cleanup
purrgrammer Sep 4, 2018
1015345
Improve cache interface and Ref usage
purrgrammer Sep 4, 2018
7c7fc86
Combine results in parallel and some cleanup
purrgrammer Sep 4, 2018
a6c9129
Tweaks to documentation
purrgrammer Sep 5, 2018
aea947a
Wip
purrgrammer Sep 5, 2018
ce7311f
Adapt examples project to new API
purrgrammer Sep 6, 2018
a5a870f
Update README tut source
purrgrammer Sep 6, 2018
77d0fb2
Update generated README
purrgrammer Sep 6, 2018
ab5e799
Parallel automatic batching when possible
purrgrammer Sep 6, 2018
87c68fe
Show timing in env descriptions
purrgrammer Sep 6, 2018
d62826e
Require Parallel[IO, IO.Par] evidence for batching
purrgrammer Sep 6, 2018
6bc5980
Include Env in errors, update debugging code
purrgrammer Sep 7, 2018
846f2be
Fix examples
purrgrammer Sep 7, 2018
5050a45
Use AsyncFreeSpec to run tests in JS
purrgrammer Sep 7, 2018
ad41319
Make sure batched fetches with missing identities fail
purrgrammer Sep 7, 2018
64eb312
Make internal Semigroup instances private
purrgrammer Sep 8, 2018
4dbe795
Use more explicit IO constructors
purrgrammer Sep 9, 2018
30e09ea
Don't use Semigroup for combining
purrgrammer Sep 9, 2018
1e23998
Remove unneeded type casts
purrgrammer Sep 9, 2018
d62165c
Update README
purrgrammer Sep 9, 2018
e789482
Incorrect tailRecM
purrgrammer Sep 9, 2018
53f5856
Syntax
purrgrammer Sep 11, 2018
c29d8a0
🔥 just a poc, does not compile
gatorcse Sep 7, 2018
bd422e0
Fetch implementation parameterised to F[_]
purrgrammer Sep 11, 2018
cb32e20
Minor cleanup
purrgrammer Sep 12, 2018
01fe2aa
Update examples
purrgrammer Sep 12, 2018
0897ed4
Update doc examples
purrgrammer Sep 13, 2018
184e8a9
Add Par to `DataSource#fetch` implicits
purrgrammer Sep 14, 2018
3e90ff5
Add tests to validation commands
purrgrammer Sep 14, 2018
f7dd3bc
Update README
purrgrammer Sep 14, 2018
fee19d6
Hide implementation
peterneyens Sep 17, 2018
a26faf0
Replace `>>` with functor syntax
peterneyens Sep 17, 2018
554ab7a
Some refactoring
peterneyens Sep 17, 2018
e907a69
Simplify flatMap
peterneyens Sep 17, 2018
b09f1ed
Merge pull request #157 from 47deg/redesign-peter
purrgrammer Sep 17, 2018
3fe15b3
Bump 1.0.0-RC1 release
purrgrammer Sep 18, 2018
153 changes: 95 additions & 58 deletions README.md
@@ -2,7 +2,7 @@

[comment]: # (Start Badges)

[![Join the chat at https://gitter.im/47deg/fetch](https://badges.gitter.im/47deg/fetch.svg)](https://gitter.im/47deg/fetch?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![Build Status](https://travis-ci.org/47deg/fetch.svg?branch=master)](https://travis-ci.org/47deg/fetch) [![codecov.io](http://codecov.io/github/47deg/fetch/coverage.svg?branch=master)](http://codecov.io/github/47deg/fetch?branch=master) [![Maven Central](https://img.shields.io/badge/maven%20central-0.7.3-green.svg)](https://oss.sonatype.org/#nexus-search;gav~com.47deg~fetch*) [![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](https://raw.githubusercontent.com/47deg/fetch/master/LICENSE) [![Latest version](https://img.shields.io/badge/fetch-0.7.3-green.svg)](https://index.scala-lang.org/47deg/fetch) [![Scala.js](http://scala-js.org/assets/badges/scalajs-0.6.17.svg)](http://scala-js.org) [![GitHub Issues](https://img.shields.io/github/issues/47deg/fetch.svg)](https://github.com/47deg/fetch/issues)
[![Join the chat at https://gitter.im/47deg/fetch](https://badges.gitter.im/47deg/fetch.svg)](https://gitter.im/47deg/fetch?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![Build Status](https://travis-ci.org/47deg/fetch.svg?branch=master)](https://travis-ci.org/47deg/fetch) [![codecov.io](http://codecov.io/github/47deg/fetch/coverage.svg?branch=master)](http://codecov.io/github/47deg/fetch?branch=master) [![Maven Central](https://img.shields.io/badge/maven%20central-0.6.1-green.svg)](https://oss.sonatype.org/#nexus-search;gav~com.47deg~fetch*) [![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](https://raw.githubusercontent.com/47deg/fetch/master/LICENSE) [![Latest version](https://img.shields.io/badge/fetch-0.6.1-green.svg)](https://index.scala-lang.org/47deg/fetch) [![Scala.js](http://scala-js.org/assets/badges/scalajs-0.6.15.svg)](http://scala-js.org) [![GitHub Issues](https://img.shields.io/github/issues/47deg/fetch.svg)](https://github.com/47deg/fetch/issues)

[comment]: # (End Badges)

@@ -38,77 +38,99 @@ Or, if using Scala.js (0.6.x):
Fetch is a library for making access to data both simple & efficient. Fetch is especially useful when querying data that
has a latency cost, such as databases or web services.

## Create a runtime

Since `Fetch` relies on `IO` from the `cats-effect` library, we'll need a runtime for executing our `IO` instances. This includes a `ContextShift[IO]` used for running the `IO` instances and a `Timer[IO]` used for scheduling. Let's go ahead and create them; we'll use a `java.util.concurrent.ScheduledThreadPoolExecutor` with a couple of threads to run our fetches.

```scala
import java.util.concurrent._
import scala.concurrent.ExecutionContext
import cats.effect.{ IO, Timer, ContextShift }

val executor = new ScheduledThreadPoolExecutor(2)
val executionContext: ExecutionContext = ExecutionContext.fromExecutor(executor)

implicit val timer: Timer[IO] = IO.timer(executionContext)
implicit val cs: ContextShift[IO] = IO.contextShift(executionContext)
```

## Define your data sources

To tell `Fetch` how to get the data you want, you must implement the `DataSource` typeclass. Data sources have `fetchOne` and `fetchMany` methods that define how to fetch such a piece of data.
To tell Fetch how to get the data you want, you must implement the `DataSource` typeclass. Data sources have `fetch` and `batch` methods that define how to fetch such a piece of data.

Data Sources take two type parameters:

1. `Identity` is a type that has enough information to fetch the data. For a users data source, this would be a user's unique ID.
2. `Result` is the type of data we want to fetch. For a users data source, this would be the `User` type.
<ol>
<li><code>Identity</code> is a type that has enough information to fetch the data</li>
<li><code>Result</code> is the type of data we want to fetch</li>
</ol>

```scala
import cats.Parallel
import cats.data.NonEmptyList

trait DataSource[Identity, Result]{
def name: String
def fetchOne(id: Identity): Query[Option[Result]]
def fetchMany(ids: NonEmptyList[Identity]): Query[Map[Identity, Result]]
def fetch(id: Identity): IO[Option[Result]]
def batch(ids: NonEmptyList[Identity])(
implicit P: Parallel[IO, IO.Par]
): IO[Map[Identity, Result]]
}
```

We'll implement a dummy data source that can convert integers to strings. For convenience, we define a `fetchString` function that lifts identities (`Int` in our dummy data source) to a `Fetch`.
Returning `IO` instances from the fetch methods allows us to specify if the fetch must run synchronously or asynchronously and use all the goodies available in `cats` and `cats-effect`.
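
As a rough illustration of the synchronous versus asynchronous choice, the sketch below wraps two stand-in lookups in `IO`. The `lookupSync` and `lookupAsync` functions are hypothetical placeholders (not part of Fetch); only `IO.delay` and `IO.async` come from `cats-effect`, and the sketch assumes the 1.x API used in this diff.

```scala
import cats.effect.IO

object SyncVsAsync {
  // Hypothetical blocking lookup, standing in for e.g. a database call.
  def lookupSync(id: Int): Option[String] = Some(id.toString)

  // Hypothetical callback-based client, standing in for an asynchronous driver.
  def lookupAsync(id: Int)(onDone: Either[Throwable, Option[String]] => Unit): Unit =
    onDone(Right(Some(id.toString)))

  // Synchronous source: suspend the side effect with IO.delay.
  def fetchSync(id: Int): IO[Option[String]] =
    IO.delay(lookupSync(id))

  // Asynchronous source: hand the IO callback to the client with IO.async.
  def fetchAsync(id: Int): IO[Option[String]] =
    IO.async[Option[String]] { cb => lookupAsync(id)(cb) }
}
```

Either function has the shape a `DataSource#fetch` implementation is expected to return, so the data source author decides how the effect is executed.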

We'll implement a dummy data source that can convert integers to strings. For convenience, we define a `fetchString` function that lifts identities (`Int` in our dummy data source) to a `Fetch`.

```scala
import scala.concurrent.duration._
import cats.Parallel
import cats.data.NonEmptyList
import cats.instances.list._
import cats.syntax.all._
import fetch._

implicit object ToStringSource extends DataSource[Int, String]{
override def name = "ToString"
override def fetchOne(id: Int): Query[Option[String]] = {
Query.sync({
println(s"[${Thread.currentThread.getId}] One ToString $id")
Option(id.toString)
})

override def fetch(id: Int): IO[Option[String]] = {
IO(println(s"--> [${Thread.currentThread.getId}] One ToString $id")) >>
IO.sleep(10.milliseconds) >>
IO(println(s"<-- [${Thread.currentThread.getId}] One ToString $id")) >>
IO(Option(id.toString))
Contributor

I prefer to use `IO.delay` instead of `IO.apply`, since it states clearly what it's doing.

Also, for lifting pure values (such as `Option(id.toString)`) you can use `IO.pure`.

Contributor Author

Good point, I'm going to update usages of `IO#apply` to be more explicit with `IO#delay` and `IO#pure`.

}
override def fetchMany(ids: NonEmptyList[Int]): Query[Map[Int, String]] = {
Query.sync({
println(s"[${Thread.currentThread.getId}] Many ToString $ids")
ids.toList.map(i => (i, i.toString)).toMap
})

override def batch(ids: NonEmptyList[Int])(
implicit P: Parallel[IO, IO.Par]
): IO[Map[Int, String]] = {
IO(println(s"--> [${Thread.currentThread.getId}] Batch ToString $ids")) >>
IO.sleep(10.milliseconds) >>
IO(println(s"<-- [${Thread.currentThread.getId}] Batch ToString $ids")) >>
IO(ids.toList.map(i => (i, i.toString)).toMap)
}
}

def fetchString(n: Int): Fetch[String] = Fetch(n) // or, more explicitly: Fetch(n)(ToStringSource)
def fetchString(n: Int): Fetch[String] = Fetch(n, ToStringSource)
```
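
The review suggestion about explicit constructors could look roughly like the sketch below. It is a standalone function rather than a `DataSource` instance, the name `fetchToString` is made up for illustration, and it assumes the `cats-effect` 1.x API and a global execution context.

```scala
import scala.concurrent.ExecutionContext
import scala.concurrent.duration._
import cats.effect.{ IO, Timer }

object ExplicitConstructors {
  // Timer is needed by IO.sleep; the global context is an assumption of this sketch.
  implicit val timer: Timer[IO] = IO.timer(ExecutionContext.global)

  def fetchToString(id: Int): IO[Option[String]] =
    for {
      // Side effects are suspended explicitly with IO.delay.
      _ <- IO.delay(println(s"--> One ToString $id"))
      _ <- IO.sleep(10.milliseconds)
      _ <- IO.delay(println(s"<-- One ToString $id"))
      // An already-computed value is lifted with IO.pure.
      result <- IO.pure(Option(id.toString))
    } yield result
}
```

`IO.delay` behaves like `IO.apply`; the difference is only that the name says the body is deferred.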

## Creating and running a fetch

Now that we can convert `Int` values to `Fetch[String]`, let's try creating a fetch.

```scala
import fetch.syntax._

val fetchOne: Fetch[String] = fetchString(1)
```

We'll run our fetches to the ambient `Id` monad in our examples. Note that in real-life scenarios you'll want to run a fetch to a concurrency monad such as `Future` or `Task`; synchronous execution of a fetch is only supported in Scala and not Scala.js and is meant for experimentation purposes.
Let's run it and wait for the fetch to complete. We'll use `IO#unsafeRunTimed` for testing purposes, which runs an `IO[A]` to an `Option[A]` and returns `None` if it didn't complete in time:

```scala
import cats.Id
import fetch.unsafe.implicits._
import fetch.syntax._
Fetch.run(fetchOne).unsafeRunTimed(5.seconds)
// --> [92] One ToString 1
// <-- [93] One ToString 1
// res0: Option[String] = Some(1)
```
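
To make the `Some`/`None` behaviour of `IO#unsafeRunTimed` concrete, here is a small sketch that does not involve Fetch at all. It assumes the JVM and the `cats-effect` 1.x API; the object and value names are illustrative only.

```scala
import scala.concurrent.duration._
import cats.effect.IO

object RunTimed {
  // Completes immediately, so the result is wrapped in Some.
  val completed: Option[Int] = IO.pure(42).unsafeRunTimed(1.second)

  // IO.never does not complete, so the time limit is hit and None is returned.
  val timedOut: Option[Int] = IO.never.unsafeRunTimed(100.milliseconds)
}
```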

Let's run it and wait for the fetch to complete:

```scala
fetchOne.runA[Id]
// [260] One ToString 1
// res3: cats.Id[String] = 1
```
As you can see in the previous example, the `ToStringSource` is queried once to get the value of 1.

## Batching

@@ -123,39 +145,42 @@ val fetchThree: Fetch[(String, String, String)] = (fetchString(1), fetchString(2
When executing the above fetch, note how the three identities get batched and the data source is only queried once.

```scala
fetchThree.runA[Id]
// [260] Many ToString NonEmptyList(3, 1, 2)
// res5: cats.Id[(String, String, String)] = (1,2,3)
Fetch.run(fetchThree).unsafeRunTimed(5.seconds)
// --> [92] Batch ToString NonEmptyList(1, 2, 3)
// <-- [93] Batch ToString NonEmptyList(1, 2, 3)
// res1: Option[(String, String, String)] = Some((1,2,3))
```

Note that the `DataSource#batch` method is not mandatory; it will be implemented in terms of `DataSource#fetch` if you don't provide an implementation.
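
One way a batch could be derived from a single-identity fetch is to run the individual fetches in parallel and collect the hits into a map. The sketch below shows that idea only; it is not necessarily how the library's default is written. `fetchOne` and `batchViaFetch` are hypothetical names, and the two-parameter `Parallel[IO, IO.Par]` form assumes cats 1.x as used in this diff.

```scala
import scala.concurrent.ExecutionContext
import cats.Parallel
import cats.data.NonEmptyList
import cats.effect.{ ContextShift, IO }
import cats.instances.list._
import cats.syntax.parallel._

object DefaultBatch {
  // Needed to derive Parallel[IO, IO.Par]; the global context is an assumption.
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

  // Hypothetical single-identity fetch: negative identities are "not found".
  def fetchOne(id: Int): IO[Option[String]] =
    IO.delay(if (id >= 0) Some(id.toString) else None)

  // Batch built from fetchOne: fetch every identity in parallel, keep the hits.
  def batchViaFetch(ids: NonEmptyList[Int])(
    implicit P: Parallel[IO, IO.Par]
  ): IO[Map[Int, String]] =
    ids.toList
      .parTraverse(id => fetchOne(id).map(result => result.map(value => id -> value)))
      .map(_.flatten.toMap)
}
```

Identities that are not found are simply absent from the resulting map in this sketch; deciding whether a missing identity is an error is left to the caller.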

## Parallelism

If we combine two independent fetches from different data sources, the fetches can be run in parallel. First, let's add a data source that fetches a string's size.

This time, instead of creating the results with `Query#sync` we are going to do it with `Query#async` for emulating an asynchronous data source.

```scala
implicit object LengthSource extends DataSource[String, Int]{
override def name = "Length"
override def fetchOne(id: String): Query[Option[Int]] = {
Query.async((ok, fail) => {
println(s"[${Thread.currentThread.getId}] One Length $id")
ok(Option(id.size))
})

override def fetch(id: String): IO[Option[Int]] = {
IO(println(s"--> [${Thread.currentThread.getId}] One Length $id")) >>
IO.sleep(10.milliseconds) >>
IO(println(s"<-- [${Thread.currentThread.getId}] One Length $id")) >>
IO(Option(id.size))
}
override def fetchMany(ids: NonEmptyList[String]): Query[Map[String, Int]] = {
Query.async((ok, fail) => {
println(s"[${Thread.currentThread.getId}] Many Length $ids")
ok(ids.toList.map(i => (i, i.size)).toMap)
})
override def batch(ids: NonEmptyList[String])(
implicit P: Parallel[IO, IO.Par]
): IO[Map[String, Int]] = {
IO(println(s"--> [${Thread.currentThread.getId}] Batch Length $ids")) >>
IO.sleep(10.milliseconds) >>
IO(println(s"<-- [${Thread.currentThread.getId}] Batch Length $ids")) >>
IO(ids.toList.map(i => (i, i.size)).toMap)
}
}

def fetchLength(s: String): Fetch[Int] = Fetch(s)
def fetchLength(s: String): Fetch[Int] = Fetch(s, LengthSource)
```

And now we can easily receive data from the two sources in a single fetch.
And now we can easily receive data from the two sources in a single fetch.

```scala
val fetchMulti: Fetch[(String, Int)] = (fetchString(1), fetchLength("one")).tupled
@@ -164,17 +189,21 @@ val fetchMulti: Fetch[(String, Int)] = (fetchString(1), fetchLength("one")).tupl
Note how the two independent data fetches run in parallel, minimizing the latency cost of querying the two data sources.

```scala
fetchMulti.runA[Id]
// [260] One ToString 1
// [261] One Length one
// res7: cats.Id[(String, Int)] = (1,3)
Fetch.run(fetchMulti).unsafeRunTimed(5.seconds)
// --> [93] One ToString 1
// --> [92] One Length one
// <-- [93] One ToString 1
// <-- [92] One Length one
// res2: Option[(String, Int)] = Some((1,3))
```
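
The interleaved `-->` and `<--` lines above are what one would expect when two independent `IO` values are combined in parallel. The sketch below shows that mechanism at the `IO` level with `parMapN`; it does not use Fetch, the value names are illustrative, and it assumes the cats 1.x and `cats-effect` 1.x APIs.

```scala
import scala.concurrent.ExecutionContext
import scala.concurrent.duration._
import cats.effect.{ ContextShift, IO, Timer }
import cats.syntax.parallel._

object ParallelSketch {
  // Both instances are assumptions of this sketch, backed by the global context.
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)
  implicit val timer: Timer[IO] = IO.timer(ExecutionContext.global)

  // Two independent effects, each with a small simulated latency.
  val toStringIO: IO[String] = IO.sleep(10.milliseconds).map(_ => 1.toString)
  val lengthIO: IO[Int] = IO.sleep(10.milliseconds).map(_ => "one".length)

  // parMapN runs both sides in parallel and combines their results.
  val both: IO[(String, Int)] = (toStringIO, lengthIO).parMapN((s, n) => (s, n))
}
```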

## Caching

When fetching an identity, subsequent fetches for the same identity are cached. Let's try creating a fetch that asks for the same identity twice.

```scala
import cats.syntax.all._

val fetchTwice: Fetch[(String, String)] = for {
one <- fetchString(1)
two <- fetchString(1)
@@ -184,19 +213,27 @@ val fetchTwice: Fetch[(String, String)] = for {
While running it, notice that the data source is only queried once. The next time the identity is requested it's served from the cache.

```scala
fetchTwice.runA[Id]
// [260] One ToString 1
// res8: cats.Id[(String, String)] = (1,1)
Fetch.run(fetchTwice).unsafeRunTimed(5.seconds)
// --> [93] One ToString 1
// <-- [92] One ToString 1
// res3: Option[(String, String)] = Some((1,1))
```
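
The caching behaviour can be pictured as a map of already-fetched identities that is consulted before the data source is queried. The sketch below models that with a `Ref` from `cats-effect`; it is a conceptual illustration under that assumption, not the library's cache implementation, and `cachedLookup` and `calls` are hypothetical names.

```scala
import cats.effect.IO
import cats.effect.concurrent.Ref

object CacheSketch {
  // Look the identity up in the cache first; query the "source" only on a miss.
  def cachedLookup(cache: Ref[IO, Map[Int, String]], calls: Ref[IO, Int])(id: Int): IO[String] =
    cache.get.flatMap { known =>
      known.get(id) match {
        case Some(hit) => IO.pure(hit)
        case None =>
          for {
            _     <- calls.update(_ + 1)          // count data source queries
            value <- IO.delay(id.toString)        // stand-in for the real fetch
            _     <- cache.update(_ + (id -> value))
          } yield value
      }
    }

  // Ask for the same identity twice and report how often the source was hit.
  val program: IO[((String, String), Int)] =
    for {
      cache <- Ref.of[IO, Map[Int, String]](Map.empty)
      calls <- Ref.of[IO, Int](0)
      one   <- cachedLookup(cache, calls)(1)
      two   <- cachedLookup(cache, calls)(1)
      n     <- calls.get
    } yield ((one, two), n)
}
```

Running `program` yields both values along with a query count of one, mirroring the single `One ToString 1` line in the output above.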




---

## Fetch in the wild

If you wish to add your library here please consider a PR to include it in the list below.

[comment]: # (Start Copyright)

# Copyright

Fetch is designed and developed by 47 Degrees

Copyright (C) 2016-2018 47 Degrees. <http://47deg.com>

[comment]: # (End Copyright)
[comment]: # (End Copyright)
25 changes: 5 additions & 20 deletions build.sbt
@@ -8,7 +8,7 @@ lazy val root = project
.in(file("."))
.settings(name := "fetch")
.settings(moduleName := "root")
.aggregate(fetchJS, fetchJVM, fetchMonixJVM, fetchMonixJS, debugJVM, debugJS, twitterJVM)
.aggregate(fetchJS, fetchJVM, debugJVM, debugJS)

lazy val fetch = crossProject
.in(file("."))
@@ -19,16 +19,6 @@ lazy val fetch = crossProject
lazy val fetchJVM = fetch.jvm
lazy val fetchJS = fetch.js

lazy val monix = crossProject
.in(file("monix"))
.dependsOn(fetch % "compile->compile;test->test")
.settings(name := "fetch-monix")
.jsSettings(sharedJsSettings: _*)
.crossDepSettings(commonCrossDependencies ++ monixCrossDependencies: _*)

lazy val fetchMonixJVM = monix.jvm
lazy val fetchMonixJS = monix.js

lazy val debug = (crossProject in file("debug"))
.settings(name := "fetch-debug")
.dependsOn(fetch)
@@ -38,22 +28,17 @@ lazy val debug = (crossProject in file("debug"))
lazy val debugJVM = debug.jvm
lazy val debugJS = debug.js

lazy val twitter = crossProject
.in(file("twitter"))
.settings(name := "fetch-twitter")
.dependsOn(fetch % "compile->compile;test->test")
.crossDepSettings(commonCrossDependencies ++ twitterUtilDependencies: _*)

lazy val twitterJVM = twitter.jvm

lazy val examples = (project in file("examples"))
.settings(name := "fetch-examples")
.dependsOn(fetchJVM)
.settings(noPublishSettings: _*)
.settings(examplesSettings: _*)
.settings(Seq(
resolvers += Resolver.sonatypeRepo("snapshots")
))

lazy val docs = (project in file("docs"))
.dependsOn(fetchJVM, fetchMonixJVM, debugJVM)
.dependsOn(fetchJVM, debugJVM)
.settings(name := "fetch-docs")
.settings(docsSettings: _*)
.settings(noPublishSettings)