[Proposal] extend language integrated query clauses with "apply" clause #3571

Closed
paulomorgado opened this Issue Jun 18, 2015 · 21 comments

paulomorgado commented Jun 18, 2015

Every now and then there's a new request for new query clauses (C#, Visual Basic).

Sometimes it's to add to C# some query clause that Visual Basic already has (Aggregate, Distinct, Skip, Skip While, Take, Take While), or materialization of the query (ToArray, ToList, ToDictionary or ToLookup), or some other selection or aggregation (First, FirstOrDefault, Last, LastOrDefault, Single, SingleOrDefault, Count, etc.).

There are just too many and chances are that more will come.

How about extending the query language with an apply clause?

This apply clause would be composed of an apply keyword followed by the method name (instance or extension like already happens to Select or Where) and any other parameters.

Then it would be possible to write something like this:

from c in customers
apply Distinct c.CountryID
select c.Country.Name

Or this:

from o in customer.Orders
select o
apply FirstOrDefault

Or this:

from c in customers
select c
apply ToDictionary c.ID

I don't know if I like this or not, but it looks to me a lot more consistent and all-encompassing than individual proposals.

Contributor

svick commented Jun 18, 2015

For the record, the previous proposals on GitHub are #3486 and #100 (and I'm sure there were some discussions on CodePlex too).

Contributor

svick commented Jun 18, 2015

I would really like something like this, since it's completely general. But I think the syntax needs to be fleshed out much more. Some questions:

  • How do you apply methods with multiple parameters (like the two-selector overload of ToDictionary())?

    The variants consistent with this proposal are apply ToDictionary c.ID c.Name and apply ToDictionary c.ID, c.Name, but personally I don't like either. Instead, I would prefer call-like syntax for all methods: apply ToDictionary(c.ID, c.Name) and e.g. apply First().

  • When no range variable is mentioned, how do you differentiate between normal parameters (e.g. apply Take(10)) and lambda parameters (e.g. the second parameter of apply ToDictionary(c.ID, 0))?

    F# LINQ (which already has a similar kind of extensibility) uses attributes for this. Another option would be to use the variant that compiles, while preferring one of them when both compile (though that might easily lead to confusing error messages).

  • How do you differentiate between methods that maintain range variables and so can be used in the middle of a query (like Distinct) and those that don't and so have to be used after the final select (like First)?

    F# LINQ again uses attributes for this. And another option would be to just assume the user is right, but I think that would require lots of care to make the error messages understandable (instead of something like "Could not find an implementation of the query pattern for source type 'int'. 'Select' not found." for the query from i in list apply First select 2*i).
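To make the second question concrete, here is a minimal sketch in today's method syntax, assuming hypothetical sample data (the `customers` tuples are illustrative and not from the proposal). `Take` receives a plain value, while both arguments of `ToDictionary` are lambdas, even when the second one never touches the range variable. An `apply` clause would have to decide between the two forms somehow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; the names are illustrative only.
var customers = new[]
{
    (ID: 1, Name: "Ann"),
    (ID: 2, Name: "Bob"),
    (ID: 3, Name: "Cid"),
};

// Take receives a plain value: nothing is wrapped in a lambda.
List<(int ID, string Name)> firstTwo = customers.Take(2).ToList();

// ToDictionary receives two lambdas, even though the second one ignores
// the range variable and always returns the constant 0.
Dictionary<int, int> zeros = customers.ToDictionary(c => c.ID, c => 0);

Console.WriteLine(firstTwo.Count);
Console.WriteLine(zeros[3]);
```

In a call-like form such as `apply ToDictionary(c.ID, 0)`, the compiler would need a rule (attributes, or trying both interpretations) to know that `0` stands for `c => 0` here but that `10` in `apply Take(10)` is just a value.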

gordanr commented Jun 18, 2015

'Apply' could be a very good starting point for discussion.

Call-like syntax is maybe more suitable for the C# language.

from c in customers
select c
apply ToDictionary c.ID

from c in customers
select c
apply ToDictionary(c.ID)

When there are no parameters, is it a good idea to allow both forms?

from o in customer.Orders
select o
apply FirstOrDefault

from o in customer.Orders
select o
apply FirstOrDefault()

Some methods are natural after the final select.

... select expression apply ToList/ToArray/ToDictionary/...

Some methods can come after the final select.

... select expression apply FirstOrDefault/First/Single/...

But they could also be written in infix notation.

...select FirstOrDefault/First/Single/... expression

Some methods are maybe more natural in infix notation (as in SQL).

...select Distinct expression

But they can also be written at the end.

...select expression apply Distinct

Author

paulomorgado commented Jun 18, 2015

Not all methods of Enumerable are available through LINQ clauses. So, for lack of a better syntax, I intentionally left out those not translatable to this syntax.

I intentionally did not use a call-like syntax because that's how all existing clauses were defined.

All the rules that apply now, and to method invocation style, would still apply. The fact that you don't have a TResult Select<TSource, TResult>(TSource source, Func<TSource, TResult> selector) where source and result are not enumerables doesn't mean you can't have one. In fact, in the last few years, I've seen Bart de Smet twisting LINQ in very interesting ways.

gordanr commented Jun 20, 2015

> So, for lack of a better syntax, I intentionally left out those not translatable to this syntax.

Can we make a minimum subset of methods that are suitable for implementation?

... select expression apply ToList
... select expression apply ToArray
... select expression apply ???

Is it a good idea, for a first version, to avoid methods with parameters?

gordanr commented Jun 20, 2015

@paulomorgado

Can you explain this example?

from c in customers
apply Distinct c.CountryID
select c.Country.Name

I don't understand how this query can be translated into a pure method chain. Does it mean

(from c in customers select c.Country.Name).Distinct()

Maybe you meant

from c in customers
select c.Country.Name
apply Distinct

When we talk about Distinct, we should keep in mind how the default comparer works.

gordanr commented Jun 20, 2015

More precisely, I don't understand how to use 'apply' before 'select expression'.

Today we can write 'where' several times between the first 'from' and the last 'select'.
For example, I usually write 'let' between two 'where' clauses.

from ...
...
where condition
...
where condition
...
where condition
...
select expression

Are we thinking of this form of 'apply'?

from ...
...
apply Method
...
apply Method
...
apply Method
...
select expression
apply Method

Can we make a list of suitable methods in this case?
Is it a good approach?

Author

paulomorgado commented Jun 20, 2015

@gordanr,

> > So, for lack of a better syntax, I intentionally left out those not translatable to this syntax.
>
> Can we make a minimum subset of methods that are suitable for implementation?
>
>     ... select expression apply ToList
>     ... select expression apply ToArray
>     ... select expression apply ???
>
> Is it a good idea, for a first version, to avoid methods with parameters?

The point of this proposal is to not be limited to any set of methods, and to allow methods with zero or one parameter.

> Can you explain this example?
>
>     from c in customers
>     apply Distinct c.CountryID
>     select c.Country.Name
>
> I don't understand how this query can be translated into a pure method chain. Does it mean
>
>     (from c in customers select c.Country.Name).Distinct()
>
> Maybe you meant
>
>     from c in customers
>     select c.Country.Name
>     apply Distinct
>
> When we talk about Distinct, we should keep in mind how the default comparer works.

This:

from c in customers
apply Distinct c.CountryID
select c.Country.Name

will translate into:

customers.Distinct(c => c.CountryID).Select(c => c.Country.Name)

On the other hand, this:

from c in customers
select c.Country.Name 
apply Distinct

will translate into:

customers.Select(c => c.Country.Name).Distinct()

> More precisely, I don't understand how to use 'apply' before 'select expression'.
>
> Today we can write 'where' several times between the first 'from' and the last 'select'.
> For example, I usually write 'let' between two 'where' clauses.
>
>     from ...
>     ...
>     where condition
>     ...
>     where condition
>     ...
>     where condition
>     ...
>     select expression
>
> Are we thinking of this form of 'apply'?
>
>     from ...
>     ...
>     apply Method
>     ...
>     apply Method
>     ...
>     apply Method
>     ...
>     select expression
>     apply Method
>
> Can we make a list of suitable methods in this case?
> Is it a good approach?

apply is used like any other clause; the only difference is that its first operand is a method.

You can write this:

from ...
...
where condition
...
where condition
...
where condition
...
select expression

as this:

from ...
...
apply Where condition
...
apply Where condition
...
apply Where condition
...
apply Select expression

if you want to.
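For comparison, here is a small sketch of the equivalence that exists in C# today between the where/select clauses and the underlying method chain. The sample data and conditions are made up for illustration; the proposed `apply Where` / `apply Select` spelling would presumably name these same methods explicitly.

```csharp
using System;
using System.Linq;

int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

// Query syntax with several where clauses and a final select.
var viaClauses = from n in numbers
                 where n % 2 == 0
                 where n > 4
                 select n * n;

// The method chain those clauses correspond to. Under the proposal,
// each call below could be spelled "apply Where ..." or "apply Select ...".
var viaMethods = numbers
    .Where(n => n % 2 == 0)
    .Where(n => n > 4)
    .Select(n => n * n);

Console.WriteLine(viaClauses.SequenceEqual(viaMethods));
```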

gordanr commented Jun 20, 2015

Thank you. Very, very interesting. I will think more about that.

gordanr commented Jun 21, 2015

That was the key.

apply Where condition

Now I understand your proposal better.

> I intentionally did not use a call-like syntax because that's how all existing clauses were defined.

Yes. Now that I understand, I agree with you regarding parameters.

gordanr commented Jun 21, 2015

This is a piece of code from one of my old WinForms applications.

List<int> awardIds = panelAwards.Controls.OfType<CheckBox>()
                     .Where(c => c.Checked)
                     .Select(c => (int)c.Tag).ToList();

With apply, we could write

var awardIds = from c in panelAwards.Controls
               apply OfType<CheckBox>
               where c.Checked  // CheckBox has property Checked.
               select (int)c.Tag
               apply ToList;

Very nice.
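As a point of reference, the closest thing query syntax offers today is an explicitly typed range variable, which the compiler translates to `Cast<T>()` rather than `OfType<T>()`. The sketch below uses a plain `object[]` as a stand-in for a control collection (an assumption made so that the example runs without WinForms) to show why the two are not interchangeable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for a mixed control collection: the strings play the role of
// CheckBox and the integer plays the role of some other control type.
object[] controls = { "a", 1, "bb" };

// Method syntax: OfType<T> silently skips elements of other types.
List<int> lengths = controls.OfType<string>().Select(s => s.Length).ToList();

// Query syntax: an explicitly typed range variable compiles to Cast<string>(),
// which throws InvalidCastException on the integer once enumerated.
var viaCast = from string s in controls select s.Length;

bool castThrew = false;
try { viaCast.ToList(); }
catch (InvalidCastException) { castThrew = true; }

Console.WriteLine(string.Join(",", lengths));
Console.WriteLine(castThrew);
```

So a filtering `OfType` step in the middle of a query is one of the cases that currently has no clause form.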

gordanr commented Jun 21, 2015

Element operators execute immediately. These methods always come at the end of a query or subquery.

apply ElementAt n
apply ElementAtOrDefault n
apply First
apply FirstOrDefault
apply Last
apply LastOrDefault
apply Single
apply SingleOrDefault

var x = from ...
        where ...
        select expression
        apply First;

Or in some form of possible infix notation. This notation can coexist with apply.

var x = from ...
        where ...
        select First expression
     // select first expression

gordanr commented Jun 21, 2015

This group of my favourite methods executes immediately. These methods always come at the end of a query or subquery.

ToArray, ToDictionary, ToList, ToLookup

var r = from n in numbers
        where n % 2 == 0
        select n
        apply ToList;

var r = from n in numbers
        where n % 2 == 0
        apply ToList;

var r = from n in numbers
        where n % 2 == 0
        select n
        as List;

Dictionary<int,Order> orders =
    customers.SelectMany(c => c.Orders)
    .Where(o => o.OrderDate.Year == 2005).ToDictionary(o => o.OrderId);

var orders = from c in customers
             from o in c.Orders
             where o.OrderDate.Year == 2005
             apply ToDictionary o.OrderId

gordanr commented Jun 21, 2015

The All and Any methods execute immediately. They always come at the end of a query or subquery.

Here is one VB example.

Dim query = From pers In people 
            Where (Aggregate pt In pers.Pets Into All(pt.Age > 2)) 
            Select pers.Name

Dim query = From pers In people 
            Where (Aggregate pt In pers.Pets Into Any(pt.Age > 7)) 
            Select pers.Name

And some F#.

query {
    for student in db.Student do
    where (query { for courseSelection in db.CourseSelection do
                   exists (courseSelection.StudentID = student.StudentID) })
    select student
}

I am not convinced that it is good to introduce the special form of syntax that exists in VB for this purpose.

var query = from pers in people
            where (from pt in pers.Pets select pt apply Any pt.Age > 7)
         // where (from pt in pers.Pets apply Any pt.Age > 7)
            select pers.Name

gordanr commented Jun 21, 2015

The aggregate methods (Average, Count, LongCount, Max, Min, Sum) execute immediately. They always come at the end of a query or subquery.

Here is VB syntax.

Dim avg = Aggregate temp In temperatures Into Average()

Dim highTemps As Integer = Aggregate temp In temperatures Into Count(temp >= 80)

Dim numTemps As Long = Aggregate temp In temperatures Into LongCount()

Dim maxTemp = Aggregate temp In temperatures Into Max()

Dim minTemp = Aggregate temp In temperatures Into Min()

Dim orderTotal = Aggregate order In orders Into Sum(order.Amount)

And some F#

query {
    for student in db.Student do
    averageByNullable (Nullable.float student.Age)
    }

query {
    for student in db.Student do
    averageBy (float student.StudentID)
}

query {
   for student in db.Student do
   sumBy student.StudentID
   }

let student =
    query {
        for student in db.Student do
        maxBy student.StudentID
    }

We can avoid the special form of Aggregate syntax that exists in VB.

Dim customerMax = From cust In customers
                  Aggregate order In cust.Orders Into MaxOrder = Max(order.Amount)
                  Select cust.CompanyName, MaxOrder

var customerMax = from cust in customers
                  let maxOrder = from o in cust.Orders select o.Amount apply Max
               // let maxOrder = from o in cust.Orders apply Max o.Amount
                  select new { cust.CompanyName, maxOrder }

Some possible infix improvements that can coexist with apply.

var customerMax = from cust in customers
               // let maxOrder = from o in cust.Orders select Max o.Amount // infix form
               // let maxOrder = from o in cust.Orders select max o.Amount // infix form, deep integration
                  select new { cust.CompanyName, maxOrder }

gordanr commented Jun 21, 2015

Here is VB syntax.

Dim query = From word In words Skip 4

Dim query = From word In words Skip While word.Substring(0, 1) = "a" 

Dim query = From word In words Take 2

Dim query = From word In words Take While word.Length < 6

And some F#

query {
    for number in data do
    skipWhile (number < 3)
    select number
}

C# examples

var numbersSubset = numbers.Take(5).Skip(4);

from n in numbers
select n
apply Take 5
apply Skip 4

from n in numbers
select n
take 5 // possible syntax in the future?
skip 4

var remaining = numbers.SkipWhile(n => n < 9);

var remaining = from n in numbers
                apply SkipWhile n < 9
                select n

var remaining = from n in numbers
                skipWhile n < 9 // possible syntax in the future? deep integration?
                select n

var result = products.OrderByDescending(p => p.UnitPrice).Take(10);

var result = from p in products
             orderby p.UnitPrice descending
             apply Take 10;

var result = from p in products
             orderby p.UnitPrice descending
             take 10; // possible syntax in the future?

gordanr commented Jun 21, 2015

VB example

Dim distinctQuery = From grade In classGrades Select grade Distinct

C# example

IEnumerable<string> productCategories = products.Select(p => p.Category).Distinct();

var productCategories = from p in products
                        select p.Category
                        apply Distinct;

Or some possible alternatives.

var productCategories = from p in products
                        select p.Category Distinct;
                     // select p.Category distinct;
                     // select distinct p.Category;
                     // select Distinct p.Category;

@paulomorgado I am not sure that Distinct is used with correct syntax here:

customers.Distinct(c => c.CountryID).Select(c => c.Country.Name)

gordanr commented Jun 21, 2015

Some examples in C# and F#

char[] apple = { 'a', 'p', 'p', 'l', 'e' };
char[] reversed = apple.Reverse().ToArray();

from c in apple
select c
apply Reverse
apply ToArray;

from c in apple
select c
apply Reverse
as Array; // That's my proposal for materialization with 'as'.

let b =
    query {
        for student in db.Student do
        select student.Age
        contains 11
    }

bool b = from student in db.Student
         select student.Age
         apply Contains 11

Author

paulomorgado commented Jun 21, 2015

@gordanr,

> @paulomorgado I am not sure that Distinct is used with correct syntax here:
>
>     customers.Distinct(c => c.CountryID).Select(c => c.Country.Name)

I was thinking of some of my own extensions: "LINQ: Enhancing Distinct With The SelectorEqualityComparer".
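For readers without that extension at hand: `Enumerable.Distinct` itself has no key-selector overload, so a call like `Distinct(c => c.CountryID)` needs a helper. The following is only a rough sketch of the idea, not the SelectorEqualityComparer-based extension referred to above; `DistinctByKey` and the sample data are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: keeps the first element seen for each key.
// It is a stand-in, not the SelectorEqualityComparer-based extension.
static IEnumerable<T> DistinctByKey<T, TKey>(
    IEnumerable<T> source, Func<T, TKey> keySelector)
{
    var seen = new HashSet<TKey>();
    foreach (var item in source)
    {
        // HashSet.Add returns false when the key was already present.
        if (seen.Add(keySelector(item)))
            yield return item;
    }
}

// Illustrative data: two customers share CountryID 1.
var customers = new[]
{
    (Name: "Ana", CountryID: 1, Country: "Portugal"),
    (Name: "Rui", CountryID: 1, Country: "Portugal"),
    (Name: "Eva", CountryID: 2, Country: "Spain"),
};

// Roughly what "apply Distinct c.CountryID" followed by
// "select c.Country.Name" is meant to express.
List<string> countries = DistinctByKey(customers, c => c.CountryID)
    .Select(c => c.Country)
    .ToList();

Console.WriteLine(string.Join(",", countries));
```

On .NET 6 and later, the built-in `Enumerable.DistinctBy` covers the same need.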

weitzhandler commented Dec 16, 2015

#100 (comment)

I strongly vote for this one.

I never use LINQ query syntax; my coding guidelines say to avoid it and always call the extension methods directly. The reason is the ugly parentheses each query has to be wrapped in to materialize it (ToArray, ToList, Sum, SingleOrDefault, etc.).

Until this issue is addressed, the language's built-in LINQ syntax is nearly useless.
I really hope to see this implemented soon, maybe to a more limited extent (avoiding the introduction of a gazillion new language keywords).

I'd say the syntax should provide an operator that expects a shortcutted extension method available for IEnumerable<TElement>, for instance:

//parents is Parent[]
var parents = from student in students
                     where student.Age < 18
                     select student.Parent
                     call ToArray()

//student is Student
var student = from st in students
                      call SingleOrDefault(st => st.Id == id);

Asynchronous methods should also be supported:

   var student = from st in students
                         call await SingleOrDefaultAsync(st => st.Id == id);

Maybe there should be a verbose LINQ-fashioned way to pass the arguments and completely avoid the use of parentheses, but I personally don't see it as necessary.

Anyway this feature is crucial for the completeness of the LINQ syntax.

Some suggestions above proposed putting ToList at the beginning of the query, but that's a no-go: we want to be able to process the result after the selection, and we don't want to be limited to parameterless extension methods. What if we want to call ToLookup with a key selector? Bottom line: we can't tie this to the language. We just need a way to call any Enumerable extension method on the current query state and treat its result as the final type of the query.
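For reference, this is roughly what the two `call` examples above look like in C# as it stands today, using made-up sample data (the `students` tuples are an assumption for illustration). The first shows the parenthesized wrapping complained about; the second shows that a predicate-taking `SingleOrDefault` has no clause form and is written as a plain method call.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; field names mirror the examples above.
var students = new[]
{
    (Id: 1, Age: 17, Parent: "Maria"),
    (Id: 2, Age: 19, Parent: "Jorge"),
    (Id: 3, Age: 16, Parent: "Lena"),
};

// Today the whole query must be parenthesized before ToArray can be called.
string[] parents = (from student in students
                    where student.Age < 18
                    select student.Parent).ToArray();

// SingleOrDefault with a predicate has no query-clause form,
// so it is written as a plain method call.
int id = 2;
var match = students.SingleOrDefault(st => st.Id == id);

Console.WriteLine(string.Join(",", parents));
Console.WriteLine(match.Parent);
```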

Member

gafter commented Mar 24, 2017

Issue moved to dotnet/csharplang #333 via ZenHub
