The Arrow Query Language. #1079

Merged
merged 20 commits into master from rr-arrow-query-language on Nov 11, 2018

6 participants
@raulraja
Member

raulraja commented Nov 1, 2018

The Arrow query language (name TBD) offers SQL, Monoid comprehensions and LINQ-style queries generalized to all data types that can provide instances for the type classes that it uses internally.

This work is inspired by random ideas drawn from:

AQL gives you a unified single language that is familiar to users and that does not require users to understand the type class combinators.

AQL covers any use cases in which you are extracting and transforming data and your data is contained within a data type for which a kinded representation exists.

Ex:

  • query a remote data source: Deferred, IO
  • query a remote Spark distributed data set: Dataset
  • query an in-memory collection: List, Set, ..
  • query nested type constructors. Ex: Deferred<List<A>>
  • query JSON trees and, in general, any kinded sum type, since we can derive Foldable instances for those

In general, you can use DSL combinators over any data type value as long as the data type provides instances for the combinator in use.

The current set of type operators and type class dependencies are:

TypeClass   Dependency
---------   ------------------
Select      Functor
From        (Data type source)
Join        Applicative
Where       FunctorFilter
Count       Foldable
Sum         Foldable
GroupBy     Foldable
OrderBy     Foldable
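
As a rough illustration of the table above, the sketch below mirrors each operator with the standard-library call that plays the same role for a plain List. It does not use Arrow or the AQL API: the function names (selectNames, whereOlderThan, countAll, sumAges, groupByAge, product) are hypothetical, and the pairing of each operator with a stdlib function is an assumption made only to show the shape of the dependency.

```kotlin
data class Student(val name: String, val age: Int)

val students = listOf(Student("John", 30), Student("Jane", 32), Student("Jack", 32))

// Select ~ Functor: transform every element.
fun selectNames(xs: List<Student>): List<String> = xs.map { it.name }

// Where ~ FunctorFilter: keep only the elements matching a predicate.
fun whereOlderThan(xs: List<Student>, age: Int): List<Student> = xs.filter { it.age > age }

// Count and Sum ~ Foldable: reduce the whole structure to a single value.
fun countAll(xs: List<Student>): Long = xs.fold(0L) { acc, _ -> acc + 1 }
fun sumAges(xs: List<Student>): Long = xs.fold(0L) { acc, s -> acc + s.age }

// GroupBy ~ Foldable: fold the elements into a Map keyed by the grouping function.
fun groupByAge(xs: List<Student>): Map<Int, List<Student>> = xs.groupBy { it.age }

// Join ~ Applicative: pair every element of one source with every element of the other.
fun <A, B> product(xs: List<A>, ys: List<B>): List<Pair<A, B>> =
  xs.flatMap { a -> ys.map { b -> a to b } }

fun main() {
  println(selectNames(students))
  println(sumAges(whereOlderThan(students, 30)))
  println(groupByAge(students).keys)
  println(product(listOf(1), listOf("a")))
}
```

The point of the generalization is that the same operator names would apply to any data type providing the listed instance, not only to List.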

Instances for some of the common data types will be included as well as submodules for the different arrow integrations.

Before this is merged I will fix the @extension processor to support infix type class extension projections, in order to achieve true SQL-like syntax in addition to the fluent builder.

Additionally, in AQL we are free to give combinators like bind or flatMap other names, such as chain. This is in line with how SQL expressions try to resemble plain English. These combinators would be a functional superset of SQL, similar in spirit to LINQ but supporting other type classes for error handling, traversing, effects, etc.

A few examples of AQL in action over List are shown below:

package arrow.aql.tests

import arrow.aql.Ord
import arrow.aql.instances.id.select.value
import arrow.aql.instances.list.count.count
import arrow.aql.instances.list.count.value
import arrow.aql.instances.list.from.join
import arrow.aql.instances.list.groupBy.groupBy
import arrow.aql.instances.list.orderBy.order
import arrow.aql.instances.list.orderBy.orderMap
import arrow.aql.instances.list.select.query
import arrow.aql.instances.list.select.select
import arrow.aql.instances.list.select.value
import arrow.aql.instances.list.sum.sum
import arrow.aql.instances.list.union.union
import arrow.aql.instances.list.where.where
import arrow.aql.instances.list.where.whereSelection
import arrow.aql.instances.listk.select.select
import arrow.aql.instances.listk.select.selectAll
import arrow.core.Id
import arrow.instances.order
import arrow.test.UnitSpec
import io.kotlintest.KTestJUnitRunner
import io.kotlintest.matchers.shouldBe
import org.junit.runner.RunWith

@RunWith(KTestJUnitRunner::class)
class AQLTests : UnitSpec() {

  init {

    "AQL is able to `select`" {
      listOf(1, 2, 3).query {
        select { this * 10 }
      }.value() shouldBe listOf(10, 20, 30)
    }

    "AQL is able to `select count`" {
      listOf(1, 2, 3).query { select { this } }.count()
        .value() shouldBe 3L
    }

    "AQL is able to `select`, transform and filter data with `where`" {
      listOf(1, 2, 3).query {
        selectAll() where { this > 2 }
      }.value() shouldBe listOf(3)
    }

    "AQL is able to `select`, transform and filter data with `in`" {
      listOf(1, 2, 3).query {
        selectAll() where { this in listOf(3) }
      }.value() shouldBe listOf(3)
    }

    "AQL is able to `join` and transform data for List" {
      (listOf(1) join listOf("a")).query {
        select { "$a$b" } where { a > 0 } whereSelection { startsWith("1") }
      }.value() shouldBe listOf("1a")
    }

    "AQL is able to `groupBy`" {
      data class Student(val name: String, val age: Int)

      val john = Student("John", 30)
      val jane = Student("Jane", 32)
      val jack = Student("Jack", 32)
      listOf(john, jane, jack).query {
        selectAll() groupBy { age }
      }.value() shouldBe Id(mapOf(30 to listOf(john), 32 to listOf(jane, jack)))
    }

    data class Student(val name: String, val age: Int)

    val john = Student("John", 30)
    val jane = Student("Jane", 32)
    val jack = Student("Jack", 32)
    val chris = Student("Chris", 40)

    "AQL is able to `groupBy`" {
      listOf(john, jane, jack).query {
        selectAll() where { age > 30 } groupBy { age }
      }.value() shouldBe mapOf(32 to listOf(jane, jack))
    }

    "AQL is able to `sum`" {
      listOf(john, jane, jack).query {
        selectAll() where { age > 30 } sum { age.toLong() }
      }.value() shouldBe 64L
    }

    "AQL is able to `order by Asc` simple selects" {
      listOf(1, 2, 3).query {
        select { this * 10 } order Ord.Asc(Int.order())
      }.value() shouldBe listOf(10, 20, 30)
    }

    "AQL is able to `order by Desc` simple selects" {
      listOf(1, 2, 3).query {
        select { this * 10 } order Ord.Desc(Int.order())
      }.value() shouldBe listOf(30, 20, 10)
    }

    "AQL is able to `order by Desc` simple selects with explicit instance" {
      listOf(1, 2, 3).query {
        select { this * 10 } order Ord.Desc(Int.order())
      }.value() shouldBe listOf(30, 20, 10)
    }


    "AQL is able to `groupBy` and then order `keys`" {
      listOf(john, jane, jack).query {
        selectAll() where { age > 30 } groupBy { age } orderMap Ord.Desc(Int.order())
      }.value() shouldBe mapOf(32 to listOf(jane, jack))
    }

    "AQL is able to `union`" {
      val queryA = listOf(
        "customer" to john,
        "customer" to jane
      ).select { this }
      val queryB = listOf(
        "sales" to jack,
        "sales" to chris
      ).select { this }
      queryA.union(queryB).value() shouldBe listOf(
        "customer" to john,
        "customer" to jane,
        "sales" to jack,
        "sales" to chris
      )
    }
  }
}

The initial DRAFT of AQL, with support for just the core type classes, is in. I have not yet added support for the rest of the modules, to give the design enough time to evolve before we depend on a ton of instances and code that would need updating.

WIP prototype. The Arrow Query Language.
The Arrow query language offers SQL, Monoid comprehensions and LINQ
style queries generalized to all data types.
@@ -4,7 +4,6 @@ VERSION_NAME=0.8.0-SNAPSHOT
# Gradle options
org.gradle.jvmargs=-Xmx4g
# Kotlin configuration
kotlin.coroutines=enable

@pakoito

pakoito Nov 1, 2018

Member

This is where this was

testCompile project(':arrow-test')
}
apply from: rootProject.file('gradle/gradle-mvn-push.gradle')

@pakoito

pakoito Nov 1, 2018

Member

Do we plan to have this out for 0.8.0?

@raulraja

raulraja Nov 1, 2018

Member

0.9.0. It's in the early design phase.

@pakoito

pakoito Nov 1, 2018

Member

We could have it as a prototype without docs tbh. Or merge after the 0.8.0 cut.

@@ -0,0 +1,6 @@
package arrow.aql

@pakoito

pakoito Nov 1, 2018

Member

we're calling these files syntax for now

this@value.from.value()
fun <Z, X> Query<ForId, Map<X, List<Z>>, Map<X, List<Z>>>.value(dummy: Unit = Unit): Map<X, List<Z>> =
this@value.from.value()

@pakoito

pakoito Nov 1, 2018

Member

This explicit this qualifier shouldn't be necessary.

import arrow.typeclasses.Foldable
import com.sun.org.apache.xpath.internal.operations.Bool
interface Sum<F> {

@pakoito

pakoito Nov 1, 2018

Member

We already have a Sum in the codata package. Could it be renamed to Add? Or maybe we should rename the codata one, the name's not very descriptive.

import arrow.typeclasses.Applicative
import arrow.typeclasses.Functor
@extension interface EitherSelect<L> : Select<EitherPartialOf<L>> {

@pakoito

pakoito Nov 1, 2018

Member

Consistent formatting of @extension in the instances in this file

override fun applicative(): Applicative<ForEval> = Eval.applicative()
}
@extension interface EvalSelect : Select<ForEval> {

@pakoito

pakoito Nov 1, 2018

Member

Same :D

.value() shouldBe mapOf(32 to listOf(jane, jack))
}
// "AQL is able to `select count` and `groupBy`" {

@raulraja raulraja requested a review from arrow-kt/maintainers Nov 1, 2018

@@ -33,4 +33,10 @@ data class Id<out A>(val value: A) : IdOf<A> {
fun <A> just(a: A): Id<A> = Id(a)
}
override fun equals(other: Any?): Boolean =

@pakoito

pakoito Nov 1, 2018

Member

What was the issue with the default implementation?

@pakoito

pakoito Nov 1, 2018

Member

Oh, right, you're comparing wrapped and unwrapped values.

@raulraja

raulraja Nov 1, 2018

Member

Id and the K Wrappers should all express their equality to the value they wrap or things can go wrong when comparing wrappers and wrapped values with ==. This surfaced as I was testing the DSL
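
A minimal sketch of the equality behavior described in this comment, assuming a hypothetical Wrapper class that stands in for Id or a K wrapper (it is not Arrow's actual class). The helper sameValue is also hypothetical; it only forces the comparison through Any so that == compiles for unrelated static types.

```kotlin
// Hypothetical wrapper standing in for Id or a K wrapper; not Arrow's implementation.
class Wrapper<out A>(val value: A) {
  // Compare by the wrapped value, whether or not the other side is itself wrapped.
  override fun equals(other: Any?): Boolean =
    when (other) {
      is Wrapper<*> -> other.value == value
      else -> other == value
    }

  // Keep hashCode consistent with the value-based equals above.
  override fun hashCode(): Int = value.hashCode()
}

// Hypothetical helper: a == b dispatches to a.equals(b) for non-null a.
fun sameValue(a: Any?, b: Any?): Boolean = a == b

fun main() {
  println(sameValue(Wrapper(1), 1))          // the wrapper's equals is used
  println(sameValue(Wrapper(1), Wrapper(1))) // both sides wrapped
  println(sameValue(1, Wrapper(1)))          // Int's equals is used instead
}
```

Note that only the wrapper's equals knows about unwrapping, so the comparison is not symmetric: with the wrapped value on the left-hand side, the plain value's own equals decides the result.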

@pakoito

pakoito Nov 1, 2018

Member

We can probably fix this with inline classes. I expect them to be the same.

@pakoito

pakoito Nov 1, 2018

Member

Well, they aren't -.- Stupid nominal typing.

Still, we should re-encode them as inlined at some point.

@@ -3,6 +3,8 @@ package arrow.mtl.typeclasses
import arrow.Kind
import arrow.Kind2
import arrow.core.Tuple2
import arrow.core.curry

@pakoito

pakoito Nov 1, 2018

Member

Does it need curry?

@@ -33,6 +33,13 @@ data class ListK<out A>(val list: List<A>) : ListKOf<A>, List<A> by list {
return Eval.defer { loop(this) }
}
override fun equals(other: Any?): Boolean =
when (other) {

@pakoito

pakoito Nov 1, 2018

Member

I don't disagree, but we should also do it for the other wrappers, and to be honest if we're using 1.3 we should just use inline classes instead.

@pakoito

pakoito approved these changes Nov 1, 2018

Awesome! It just works too, I can see people being happy with this :D Some notes here and there, as usual ;)

else None
})
}

@pakoito

pakoito Nov 1, 2018

Member

Could we add an alias here for Query<F, A, Z>.value()? Does it make sense?

@pakoito

One thing we could add to the DSL is a way of unpacking the value directly using Comonad<F>#extract()

@rafaparadela

Contributor

rafaparadela commented Nov 1, 2018

It would be beautiful to be able to use FROM to define the source in the same position as in an SQL query. Something like:

query
  .select { it }
  .from(listOf(1, 2, 3))
  .where { it > 2 } shouldBe listOf(3)

I mean, the syntax might operate over queries instead of over data.

@pakoito

Member

pakoito commented Nov 1, 2018

We also need a join or selectMany that uses flatMap behind the scenes.

@raulraja

Member

raulraja commented Nov 1, 2018

@pakoito join is already there and it delegates to Applicative#product.
@rafaparadela it is not possible to have from after select unless you explicitly ascribe the lambda in select or provide the entire set of type arguments to select<F, In, Out>:

select { it: Int -> ... }
  .from(listOf(1, 2, 3))
  .where { it > 2 }

@JorgeCastilloPrz

Member

JorgeCastilloPrz commented Nov 1, 2018

+1 to supporting From as a type class too, since we'll need that to achieve human-like language once we get infix notation ready: "select whatever from wherever".

Maybe we could make select be the mandatory initial operation with some refactoring, regardless of the syntax used (fluent vs infix). That would be ideal. Can't we enforce select to infer the types given the lambda?

)
}
fun Query<ForListK, Long, Long>.value(): Long =

@JorgeCastilloPrz

JorgeCastilloPrz Nov 1, 2018

Member

I assume the plan with value is to have a terminal operation that forces the complete chain to resolve. Would it make more sense to call it "run", "execute" or something similar?

@raulraja

raulraja Nov 1, 2018

Member

Yes, they should be renamed to run, but in some cases the resulting value may be a pure expression, for example when working with IO.

@JorgeCastilloPrz

Member

JorgeCastilloPrz commented Nov 1, 2018

IMO this is potentially a powerful language for teaching FP concepts using "human" words (tutorials, courses, etc.). Good job on implementing those papers.

@raulraja

Member

raulraja commented Nov 1, 2018

@JorgeCastilloPrz we can't place select first without explicitly typing the lambda or the type arguments. This is a limitation of type inference in Kotlin, which proceeds in order from left to right.
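
A small sketch of the inference limitation, using plain Kotlin and no Arrow code. PendingSelect, select, from and query are hypothetical stand-ins written for this example; they are not the AQL definitions.

```kotlin
// Hypothetical stand-in for a selection whose source is supplied later.
class PendingSelect<A, Z>(private val f: (A) -> Z) {
  infix fun from(source: List<A>): List<Z> = source.map(f)
}

// Select first: nothing to the left of the lambda tells the compiler what A is.
fun <A, Z> select(f: (A) -> Z): PendingSelect<A, Z> = PendingSelect(f)

// Source first: A is fixed by the receiver, so the lambda needs no annotation.
fun <A, Z> List<A>.query(f: (A) -> Z): List<Z> = map(f)

// Does not compile, because A cannot be inferred from the lambda alone:
// fun broken() = select { it * 10 } from listOf(1, 2, 3)

// Compiles once the lambda parameter is ascribed explicitly.
fun selectFirst(): List<Int> = select { x: Int -> x * 10 } from listOf(1, 2, 3)

// Compiles without annotations because the source comes first.
fun sourceFirst(): List<Int> = listOf(1, 2, 3).query { it * 10 }

fun main() {
  println(selectFirst())
  println(sourceFirst())
}
```

Both forms produce the same list; the difference is only in how much type information the caller has to spell out.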

@pakoito

Member

pakoito commented Nov 2, 2018

> @pakoito join is already there and it delegates to Applicative#product.

I mean another, new one, that delegates to flatMap. It joins with a second set of values that is created from each of the current ones, so joinWith or something like that. In LINQ it's SelectMany.
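
The difference between the two kinds of join can be sketched on plain lists. This is only an illustration: join below is a local re-implementation of a product-style join, and joinWith is the hypothetical name floated in this comment, not an existing AQL combinator.

```kotlin
// Product-style join: both sides are independent, so every pairing is produced.
infix fun <A, B> List<A>.join(other: List<B>): List<Pair<A, B>> =
  flatMap { a -> other.map { b -> a to b } }

// Hypothetical flatMap-based join: the second side is computed from each
// element of the first, which a product of two fixed sources cannot express.
fun <A, B> List<A>.joinWith(next: (A) -> List<B>): List<Pair<A, B>> =
  flatMap { a -> next(a).map { b -> a to b } }

data class Customer(val name: String, val orders: List<String>)

fun main() {
  val customers = listOf(
    Customer("John", listOf("book")),
    Customer("Jane", listOf("pen", "ink"))
  )
  // Each customer is paired only with their own orders.
  println(customers.joinWith { it.orders }.map { (c, o) -> c.name to o })
  // The product-style join pairs every left element with every right element.
  println(listOf(1, 2) join listOf("a", "b"))
}
```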

raulraja added some commits Nov 2, 2018

Type class extension support for infix functions.
Enables polymorphic DSLs with optional infix notation:
select {..} from {..}
@@ -10,7 +10,7 @@ interface GroupBy<F> {
fun foldable(): Foldable<F>
- infix fun <A, Z, X> Query<F, A, Z>.groupBy(group: (Z) -> X): Query<ForId, Map<X, List<Z>>, Map<X, List<Z>>> =
+ infix fun <A, Z, X> Query<F, A, Z>.groupBy(group: Z.() -> X): Query<ForId, Map<X, List<Z>>, Map<X, List<Z>>> =

@pakoito

pakoito Nov 5, 2018

Member

This is something I thought at some point could be more widespread. I'm not sure about the consequences...
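
To make the two signatures in this hunk concrete, here is a sketch on plain lists contrasting a parameter lambda, (Z) -> X, with a receiver lambda, Z.() -> X. The functions groupByParam and groupByReceiver are hypothetical names for this example and are not part of AQL.

```kotlin
data class Student(val name: String, val age: Int)

// Parameter style: the element is passed as an argument and referred to as `it`.
fun <Z, X> List<Z>.groupByParam(group: (Z) -> X): Map<X, List<Z>> =
  groupBy(group)

// Receiver style: the element is `this` inside the lambda, so its members
// can be referenced without a qualifier.
fun <Z, X> List<Z>.groupByReceiver(group: Z.() -> X): Map<X, List<Z>> =
  groupBy { it.group() }

fun main() {
  val students = listOf(Student("John", 30), Student("Jane", 32), Student("Jack", 32))
  println(students.groupByParam { it.age }.keys)
  println(students.groupByReceiver { age }.keys)
}
```

One consequence worth keeping in mind is that, inside a receiver lambda, an unqualified this refers to the element rather than to the enclosing class, so members of the outer scope with the same name are shadowed.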

@@ -21,6 +21,9 @@ interface Select<F> {
infix fun <A, Z> Source<F, A>.select(f: Selection<A, Z>): Query<F, A, Z> =
Query(f, this)
fun <A> Source<F, A>.selectAll(): Query<F, A, A> =

@pakoito

pakoito Nov 5, 2018

Member

It'll probably feel better as a field: selectAll where { this > 2 }

@raulraja

raulraja Nov 5, 2018

Member

We can, but then we first have to update @extension to consider properties in addition to functions in type classes. Not sure how hard that would be, but it is definitely something we can change.

@codecov

codecov bot commented Nov 5, 2018

Codecov Report

Merging #1079 into master will decrease coverage by 0.2%.
The diff coverage is 32.87%.

@@             Coverage Diff              @@
##             master    #1079      +/-   ##
============================================
- Coverage     42.31%   42.11%   -0.21%     
- Complexity      755      767      +12     
============================================
  Files           365      386      +21     
  Lines         10220    10483     +263     
  Branches       1146     1182      +36     
============================================
+ Hits           4325     4415      +90     
- Misses         5560     5730     +170     
- Partials        335      338       +3

Impacted Files                                         Coverage Δ      Complexity Δ
...main/kotlin/arrow/mtl/typeclasses/FunctorFilter.kt  33.33% <ø> (ø)  0 <0> (ø) ⬇️
...cessor/src/main/java/arrow/meta/encoder/MetaApi.kt  0% <ø> (ø)      0 <0> (ø) ⬇️
...uage/src/main/kotlin/arrow/aql/instances/either.kt  0% <0%> (ø)     0 <0> (?)
...l/src/main/kotlin/arrow/mtl/instances/sequencek.kt  0% <0%> (ø)     0 <0> (?)
...nguage/src/main/kotlin/arrow/aql/instances/eval.kt  0% <0%> (ø)     0 <0> (?)
...src/main/java/arrow/meta/encoder/jvm/JvmMetaApi.kt  0% <0%> (ø)     0 <0> (ø) ⬇️
...e/src/main/kotlin/arrow/aql/instances/function0.kt  0% <0%> (ø)     0 <0> (?)
...cessor/src/main/java/arrow/meta/decoder/Decoder.kt  0% <0%> (ø)     0 <0> (ø) ⬇️
...anguage/src/main/kotlin/arrow/aql/instances/try.kt  0% <0%> (ø)     0 <0> (?)
...ions-processor/src/main/java/arrow/meta/ast/ast.kt  0% <0%> (ø)     0 <0> (ø) ⬇️
... and 56 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1472fa5...c5fcc15.

raulraja and others added some commits Nov 5, 2018

@raulraja raulraja changed the title from WIP. The Arrow Query Language. to The Arrow Query Language. Nov 9, 2018

@raulraja raulraja requested a review from arrow-kt/maintainers Nov 9, 2018

raulraja added some commits Nov 9, 2018

@raulraja

Member

raulraja commented Nov 9, 2018

I'll address some of your comments in other PRs. Primarily, we need to overhaul the names and make sure they all make sense before 1.0.0, and many of the concerns here are related to that. As for query extensions on combinators like .value(), we need to provide one at a global level for each terminal type, which so far are Kind, List, Map, Id and Long.

copy(
modifiers = emptyList(),
modifiers = modifiers.filterNot { it !in keepModifiers },

@leandroBorgesFerreira

leandroBorgesFerreira Nov 10, 2018

Collaborator

modifiers.filter { it in keepModifiers }?
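
The suggestion relies on filterNot with a negated predicate being the same as filter with the positive one. A quick sketch of that equivalence follows; the modifier names are hypothetical strings chosen for the example, whereas the real code presumably works on the processor's own modifier type.

```kotlin
// Hypothetical modifier names, used only to exercise the two predicates.
val keepModifiers = setOf("infix", "operator")
val modifiers = listOf("override", "infix", "suspend", "operator")

// Double negative, as written in the diff.
fun viaFilterNot(xs: List<String>): List<String> = xs.filterNot { it !in keepModifiers }

// Positive form, as suggested in the review.
fun viaFilter(xs: List<String>): List<String> = xs.filter { it in keepModifiers }

fun main() {
  println(viaFilterNot(modifiers))
  println(viaFilter(modifiers))
}
```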

}
fun Query<ForListK, Long, Long>.value(): Long =
foldable().run {

@leandroBorgesFerreira

leandroBorgesFerreira Nov 10, 2018

Collaborator

Is there a need to change scope to Foldable? This run is just changing scope right?

foldable().run {
val la: ListK<Z> = from.foldMap(ListK.monoid()) { listOf(select(it)).k() }
val lb: ListK<Z> = query.from.foldMap(ListK.monoid()) { listOf(query.select(it)).k() }
val result: ListK<Z> = la.combineK(lb).k()

@leandroBorgesFerreira

leandroBorgesFerreira Nov 10, 2018

Collaborator

No need for k(). Could be just val result: ListK<Z> = la.combineK(lb)

@raulraja

raulraja Nov 11, 2018

Member

result has to be a Kinded value and List on its own isn't.

@leandroBorgesFerreira

leandroBorgesFerreira Nov 11, 2018

Collaborator

You're right. I thought the List combineK would return ListK instead of List. 👍

raulraja added some commits Nov 11, 2018

}
fun <Z, X> Query<ForListK, Map<X, List<Z>>, Map<X, List<Z>>>.value(): Map<X, List<Z>> =
foldable().run {

@leandroBorgesFerreira

leandroBorgesFerreira Nov 11, 2018

Collaborator

No need for the foldable().run.

Could be just:

fun <Z, X> Query<ForListK, Map<X, List<Z>>, Map<X, List<Z>>>.value(): Map<X, List<Z>> =
    from.fix().firstOrNone().getOrElse { emptyMap() }
)
}
fun Query<ForId, Long, Long>.value(): Long =

@leandroBorgesFerreira

leandroBorgesFerreira Nov 11, 2018

Collaborator

No need for the foldable().run { here either. It could be just:

fun Query<ForId, Long, Long>.value(): Long = from.value()

@raulraja raulraja merged commit 6192fb2 into master Nov 11, 2018

2 of 4 checks passed

codecov/patch: 32.87% of diff hit (target 42.31%)
codecov/project: 42.11% (-0.21%) compared to 1472fa5
continuous-integration/travis-ci/pr: The Travis CI build passed
continuous-integration/travis-ci/push: The Travis CI build passed

@raulraja raulraja deleted the rr-arrow-query-language branch Nov 11, 2018
