Aggregations #218

Open
wants to merge 1 commit into
base: master
from

5 participants
@thobe
Contributor

thobe commented Apr 13, 2017

Aims to solve #188.

CIP2017-04-13

@thobe

Contributor

thobe commented Apr 13, 2017

This is still an early draft.

ToDo:

  • Handle ordering of aggregated data.
  • Handle parameterized aggregation.
@Mats-SX

Member

Mats-SX commented Apr 18, 2017

Related CIR: #219

[source, cypher]
.Aggregation using `collect`
----
MATCH (person:Person)-[:FRIEND]-(friend)
RETURN person.email, collect OF friend {.name, .email} AS friends
----

@Mats-SX

Mats-SX Apr 18, 2017

Member

I like the OF syntax, but collect is the operation I would like to change if we proceed with OF. collect is a verb, but a noun fits better here, which makes me think we should rename this to collection, or even list:

MATCH (person:Person {name: $name})
RETURN list OF person.age

@technige

technige Jul 21, 2017

Contributor

Having just read through, I was thinking exactly the same thing, both "nounifying" the term and switching to list. It strikes me that this also highlights the return type of the operation.

[source, cypher]
.Aggregation using `percentileCont`
----
BREAKS DOWN IN THIS SYNTAX
----

@Mats-SX

Mats-SX Apr 18, 2017

Member

Perhaps we could consider allowing arguments to the aggregator on the left-hand side of the OF?

MATCH (n)
RETURN percentileCont($percentile) OF n.property

@petraselmer

petraselmer Apr 18, 2017

Contributor

Nice ^^

To give us a few options as food for thought, how about a parentheses-free version:

MATCH (n)
RETURN percentileCont OF n.property GIVEN $percentile

I also played around with using WITH instead of GIVEN, but that is way too overloaded.

If we ever have aggregations with more than one argument of this type, e.g. myAgg(expression, arg1, arg2) in today's syntax, these are the options we'd have (continuing with the same examples as above):

  1. ... RETURN myAgg($arg1, $arg2) OF n.property
  2. ... RETURN myAgg OF n.property GIVEN $arg1, $arg2

@thobe

thobe Apr 18, 2017

Contributor

Yes, I thought about similar things on my way home after having pushed this. What I thought of then was that, at least in this case, the parameter describes which particular aggregate value to get, so perhaps a subscript operator would be appropriate: RETURN percentileCont[0.4] OF n.property

@petraselmer

petraselmer Apr 18, 2017

Contributor

So, the general - or multi-arg - case would be RETURN myAgg[$arg1, $arg2]? Or, is this not really 'general-izable'?

@technige

technige Jul 21, 2017

Contributor

Could you bracket the expression on the RHS of OF? For example:

RETURN map OF (key, value)

@Mats-SX

Mats-SX Jul 21, 2017

Member

@technige Interesting idea. Are you considering the case when both key and value are parameters that change across records, or when only one of them is?

I kind of like using OF as a separator between arguments that are read once versus arguments that are the actual substance of the aggregation.
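
The split between arguments read once and the values being aggregated can be modeled outside Cypher. The following is a minimal Python sketch under stated assumptions: `percentile_cont`, `rows`, and the `property` key are hypothetical names chosen for illustration, the linear interpolation is an assumed definition of a continuous percentile, and raising on an empty group is a simplification of the sketch. It mirrors the shape of `percentileCont($percentile) OF n.property`: the left-hand argument configures the aggregator once, and the right-hand expression is evaluated per row.

```python
from typing import Callable, Iterable, List


def percentile_cont(percentile: float) -> Callable[[Iterable[float]], float]:
    """Return an aggregator configured once with `percentile` (0.0 to 1.0)."""
    if not 0.0 <= percentile <= 1.0:
        raise ValueError("percentile must be between 0.0 and 1.0")

    def aggregate(values: Iterable[float]) -> float:
        ordered: List[float] = sorted(values)
        if not ordered:
            # Simplification of this sketch: an empty group is an error.
            raise ValueError("cannot aggregate an empty group")
        # Linear interpolation between the two nearest ranks.
        position = percentile * (len(ordered) - 1)
        lower = int(position)
        upper = min(lower + 1, len(ordered) - 1)
        fraction = position - lower
        return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

    return aggregate


# Hypothetical rows standing in for the matched nodes `n`.
rows = [{"property": 10.0}, {"property": 20.0}, {"property": 30.0}, {"property": 40.0}]

# Left of OF: read once. Right of OF: evaluated for every row.
median = percentile_cont(0.5)(row["property"] for row in rows)  # 25.0
```

In this model the `GIVEN` and subscript variants discussed above would only change where the one-time argument is written, not how the aggregation is evaluated.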

[source, cypher]
.Aggregation using `count`
----
MATCH (nodes) RETURN count OF nodes
----

@Mats-SX

Mats-SX Apr 18, 2017

Member

For count(*), are you considering count OF *?

@thobe

thobe Apr 18, 2017

Contributor

I had not thought of that, but that seems sensible.

@boggle

boggle May 8, 2017

Contributor

I've been wondering if we shouldn't just mandate always writing a kind of list comprehension syntax for aggregation, to make it more visible what's going on:

collect [ n.prop | * ]
max [ n.weight | * ]

This would allow reusing the aggregation function call syntax with regular lists!

count [ n | n IN [ 1, 2, 3, 4 ] ]

Or, even shorter:

count [ n | 1, 2, 3, 4 ]

I agree, we should just use aggrF(args) ... to pass in args to the aggregation function.

But I think it's important to have syntax that allows us to call aggregation functions for aggregations as well as over ordinary lists.

@Mats-SX

Mats-SX May 8, 2017

Member

That syntax seems pretty far away from what this CIP suggests. Could we adapt it to a better fit?

Just allow any expression after the OF, where only property expressions and expressions that evaluate to a list are valid:

max OF n.weight 
max OF [1, 2, 3, 4] // semantics depend on expression value

Although this is not ideal when considering list properties...
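
The list-property problem can be made concrete with a small Python sketch. This is one possible reading of "semantics depend on expression value", not something the CIP specifies: `max_of` is a hypothetical helper that treats a single list value as the collection to aggregate and otherwise takes one value per row.

```python
from typing import Any, Iterable, List


def max_of(values_per_row: Iterable[Any]) -> Any:
    """Hypothetical value-dependent reading of `max OF <expr>`."""
    values: List[Any] = list(values_per_row)
    if len(values) == 1 and isinstance(values[0], list):
        # A single list value is treated as the collection to aggregate.
        return max(values[0])
    # Otherwise each row contributes one value to the aggregation.
    return max(values)


weights = max_of([3, 7, 5])          # max OF n.weight over three rows: 7
literal = max_of([[1, 2, 3, 4]])     # max OF [1, 2, 3, 4]: 4

# With a list-valued property the two readings collide:
one_row = max_of([[1, 9]])           # 9, the list is unpacked
two_rows = max_of([[1, 9], [2, 0]])  # [2, 0], whole lists are compared
```

Under this assumed rule the result type for a list property changes with the number of matched rows, which is the kind of surprise the comment above points at.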

@technige

technige Jul 21, 2017

Contributor

Some similar ideas are discussed in the Python generator expression PEP (PEP 289): https://www.python.org/dev/peps/pep-0289/

@Mats-SX

Member

Mats-SX commented Apr 18, 2017

Also related to #220

@petraselmer

Contributor

petraselmer commented Apr 18, 2017

I think more examples showing the interplay between aggregations and other functions would be good, along with showing nested operations - pretty much along the lines of the examples shown in #219

@petraselmer petraselmer added the oCIG label May 31, 2017
