Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filters: Aggregations #924

Closed
michaelstaib opened this issue Jul 18, 2019 · 32 comments
Closed

Filters: Aggregations #924

michaelstaib opened this issue Jul 18, 2019 · 32 comments
Assignees
Labels
Area: Data Issue is related to filtering, sorting, pagination or projections 🎉 enhancement New feature or request 🌶️ hot chocolate
Milestone

Comments

@michaelstaib
Copy link
Member

michaelstaib commented Jul 18, 2019

Scroll down for the description :)

@PascalSenn
Copy link
Member

Aggregations

This proposal is very Work In Progress

Introduction

Aggregations are very useful for querying big data sources. To count items or sum a value, it's not necessary to load all the data. Querying already aggregated data in the backend has less overhead.

Proposal

A new API Endpoint is generated for a collection. The type of this endpoint has different aggregation fields on them.

It will not be possible to cover all aggregation cases. We simply cannot create types for all possible scenarios.

Proposed are the following:

Example Objects

All examples will take place in the following domain:

public class Foo {
     public string role {get;set;}
     public int num {get;set;}
     public Bar bars {get;set;}
}

public class Bar {
     public string baz {get;set;}
     public int blub {get;set;}
}

count

{
  foosMeta {
    count
}

returns the amount of foo's in foos.

sum

{
  foosMeta {
    sum_num
}

returns the sum of all nums.

average

{
  foosMeta {
    sum_average
}

returns the average of all nums.

group_by

Example

{
  foosMeta {
    group_by {
      role {
        key
        value {
          bar {
            baz
          }
        }
      }
    }
  }
}

How others did it

Prisma (Issue prisma/prisma#1312)

# Count all posts with a title containing 'GraphQL'
query {
  postsConnection(where: {
    title_contains: "GraphQL"
  }) {
    aggregate {
      count
    }
  }
}
{
  allUsersConnection(where: {city: "Aarhus"}) {
    aggregate {
      avg {
        age
      }
      max {
        age
      }
    }
    edges {
      node { name, age }
    }
  }
}
{
  allUsersConnection(where: {city: "Aarhus"}) {
    aggregate {
      avg {
        age
      }
      max {
        age
      }
    }
    edges {
      node { name, age }
    }
  }
}

Horsura

query {
  article_aggregate {
    aggregate {
      count
      sum {
        rating
      }
      avg {
        rating
      }
      max {
        rating
      }
    }
    nodes {
      id
      title
      rating
    }
  }
}
query {
  author (where: {id: {_eq: 1}}) {
    id
    name
    articles_aggregate {
      aggregate {
        count
        avg {
          rating
        }
        max {
          rating
        }
      }
      nodes {
        id
        title
        rating
      }
    }
  }
}

Gatsby

### no example found

GraphCMS

### no example found

@angrymrt
Copy link

In your average example, I guess you mean average_num, instead of sum_average. How would this be configured on the c# side of things?

@PascalSenn
Copy link
Member

@angrymrt You are right it would be average_num. This is just a proposal. I currently do not longer thing the meta endpoint makes that much sense
It definitely makes sense to integrate this into connections like Prisma does it.
But I also think we can aggregate endpoints at different places

We usually first work on the graphql API specification before we do the code API.
Probably via descriptor and type.
Also with UseSelection we would optimize these aggregations over IQueryable directly in the database.
At the moment I could think of Expressions that we then rip apart and recompile.

{
  allUsersConnection(where: {city: "Aarhus"}) {
    aggregate {
      avg {
        age
      }
      max {
        age
      }
    }
    edges {
      node { name, age }
    }
  }
}

@michaelstaib
Copy link
Member Author

I currently think also that we could go down the rout of lodash for aggregations. We definitely will add support in the client for that.

@michaelstaib
Copy link
Member Author

@zliebersbach
Copy link

Hi all, just wondering what the status of this issue is, and if there hasn't been any progress I would love to offer some assistance. I just need to be pointed in the right direction 😄

@PascalSenn
Copy link
Member

Hi @kkliebersbach
There was indeed no progress. we were playing around internally a bit but nothing made it to a contribution. Which approach would you want to take? the directive one or the type version on the connections?

@XardasLord
Copy link

I'm really waiting for this feature.

@shawndube
Copy link

I have to implement some kind of aggregation in the next 2 week and was going to try and do the Hasura syntax. @PascalSenn do you have any guidance on how to even define the ObjectTypes for that from a HotChocolate perspective? All my other current code is a code-first implementation so I'd like to stick with that if possible but its not clear to me how to define it. Did you maybe start a prototype that is available to look at?

@PascalSenn
Copy link
Member

@shawndube You can have a look at the Data project. Filtering is a good example.
We usually implement everything in a really generic way. If you are going to implement this for a 1 time usecase i recommend you to just use normal object types with the fields on them that you need. It is probably a lot easier

@shakibstu
Copy link

shakibstu commented Jan 31, 2021

@PascalSenn we are seriously waiting for that feature

@gojanpaolo
Copy link

Cross-posting my comment in #1260 (comment) which is also for aggregations but for sorting

Is there a way to have a graphql sort using aggregate of list field that is similar to the following C# code?

var sessionsWithAttendeesFirst = sessions.OrderByDescending(s => s.Attendees.Any());
var sessionsByMostAttendees = sessions.OrderByDescending(s => s.Attendees.Count());

I'm thinking something like this in graphql

sessions (order: { attendees: { any: DESC } } )
sessions (order: { attendees: { count: DESC } } )

@tjolr
Copy link

tjolr commented Feb 26, 2021

@PascalSenn do you know approximately when we can expect this feature to be implemented? E.g: is it closer to 1 month or 1 year?
Really looking forward to it! :-)

@michaelstaib
Copy link
Member Author

@tjolr this feature can be easily done on your own with type extensions or type interceptors.

We have put aggregates towards the end of this year since there is really not a so big need to have this in the sense that it is easy to build this on your own.

However, we will do it and have scheduled it for the November release. No promises, it might be moved to an earlier or later release. We in general look at feedback of the community to determine the need of a feature.

What we could however do is write up a blog post outlining how one could implement this. @PascalSenn what do you think?

@michaelstaib michaelstaib modified the milestones: HC-11.x.x, HC2021-11 Feb 27, 2021
@tjolr
Copy link

tjolr commented Mar 5, 2021

@michaelstaib a blog post about how this can be done would be good! 👍 But we can't wait for the official implementation of the data aggregation as well!! :-) 💯

@shakibstu
Copy link

@tjolr this feature can be easily done on your own with type extensions or type interceptors.

We have put aggregates towards the end of this year since there is really not a so big need to have this in the sense that it is easy to build this on your own.

However, we will do it and have scheduled it for the November release. No promises, it might be moved to an earlier or later release. We in general look at feedback of the community to determine the need of a feature.

What we could however do is write up a blog post outlining how one could implement this. @PascalSenn what do you think?

It would be great 👌🏻🤗

@gojanpaolo
Copy link

gojanpaolo commented Mar 5, 2021

We had implemented a custom operation handler for one of our collection types. I'm hoping this could help others develop custom aggregations while waiting for the write up :)

Essentially we're trying to recreate the following C# method

employees.Where(e => 
    e.Times
    .OrderByDescending(t => t.OutTimestamp ?? t.InTimestamp) // order by latest
    .FirstOrDefault() // get latest
    ?.Distance == 5)

This is the graphql query we'd like to have

employees(where: { times: { latest: { distance: { eq: 5 } } } })

# basically, we could do any filter operation on { times: { latest: { ... } } }

We we're able to implement the above like so:

public static class CustomOperations 
{
    public const int Latest = 1025;
}
public class CustomFilteringConvention : FilterConvention
{
   protected override void Configure(IFilterConventionDescriptor descriptor)
    {
        descriptor.AddDefaults();

        descriptor
            .Configure<ListFilterInputType<TimeFilterInputType>>(_ => _
                .Operation(CustomOperations.Latest)
                .Type<TimeFilterInputType>())
            .AddProviderExtension(
                new QueryableFilterProviderExtension(_ => _
                    .AddFieldHandler<TimeQueryableListLatestOperationHandler>()))
            .Operation(CustomOperations.Latest)
            .Name("latest");
    }
}
public class TimeQueryableListLatestOperationHandler : FilterOperationHandler<QueryableFilterContext, Expression>
{
    public override bool CanHandle(ITypeCompletionContext context, IFilterInputTypeDefinition typeDefinition, IFilterFieldDefinition fieldDefinition)
        => fieldDefinition is FilterOperationFieldDefinition { Id: CustomOperations.Latest };

    public override bool TryHandleEnter(QueryableFilterContext context, IFilterField field, ObjectFieldNode node, [NotNullWhen(true)] out ISyntaxVisitorAction? action)
    {
        Expression<Func<ICollection<Time>, Time?>> expression = _ => _
            .OrderByDescending(_ => _.OutTimestamp ?? _.InTimestamp) // order by latest
            .FirstOrDefault(); // get latest
        var invoke = Expression.Invoke(expression, context.GetInstance());
        context.PushInstance(invoke);
        action = SyntaxVisitor.Continue;
        return true;
    }
}

// in Startup.ConfigureServices
services
.AddGraphQLServer()
.AddFiltering<CustomFilteringConvention>()

Our implementation is not the most generic solution but I think it can be used as a guide on how to implement a custom list operation / aggregation. Hope this helps! Cheers! 🍺

@shakibstu
Copy link

We had implemented a custom operation handler for one of our collection types. I'm hoping this could help others develop custom aggregations while waiting for the write up :)

Essentially we're trying to recreate the following C# method

employees.Where(e => 
    e.Times
    .OrderByDescending(t => t.OutTimestamp ?? t.InTimestamp) // order by latest
    .FirstOrDefault() // get latest
    ?.Distance == 5)

This is the graphql query we'd like to have

employees(where: { times: { latest: { distance: { eq: 5 } } } })

# basically, we could do any filter operation on { times: { latest: { ... } } }

We we're able to implement the above like so:

public static class CustomOperations 
{
    public const int Latest = 1025;
}
public class CustomFilteringConvention : FilterConvention
{
   protected override void Configure(IFilterConventionDescriptor descriptor)
    {
        descriptor.AddDefaults();

        descriptor
            .Configure<ListFilterInputType<TimeFilterInputType>>(_ => _
                .Operation(CustomOperations.Latest)
                .Type<TimeFilterInputType>())
            .AddProviderExtension(
                new QueryableFilterProviderExtension(_ => _
                    .AddFieldHandler<TimeQueryableListLatestOperationHandler>()))
            .Operation(CustomOperations.Latest)
            .Name("latest");
    }
}
public class TimeQueryableListLatestOperationHandler : FilterOperationHandler<QueryableFilterContext, Expression>
{
    public override bool CanHandle(ITypeCompletionContext context, IFilterInputTypeDefinition typeDefinition, IFilterFieldDefinition fieldDefinition)
        => fieldDefinition is FilterOperationFieldDefinition { Id: CustomOperations.Latest };

    public override bool TryHandleEnter(QueryableFilterContext context, IFilterField field, ObjectFieldNode node, [NotNullWhen(true)] out ISyntaxVisitorAction? action)
    {
        Expression<Func<ICollection<Time>, Time?>> expression = _ => _
            .OrderByDescending(_ => _.OutTimestamp ?? _.InTimestamp) // order by latest
            .FirstOrDefault(); // get latest
        var invoke = Expression.Invoke(expression, context.GetInstance());
        context.PushInstance(invoke);
        action = SyntaxVisitor.Continue;
        return true;
    }
}

// in Startup.ConfigureServices
services
.AddGraphQLServer()
.AddFiltering<CustomFilteringConvention>()

Our implementation is not the most generic solution but I think it can be used as a guide on how to implement a custom list operation / aggregation. Hope this helps! Cheers! 🍺

nice try but I think that we should create aggregation fields dynamically and then resolve, for example if we have a field entitled to "Age" we should dynamically create fields "Age_Sum", "Age_Count", "Age_Avg" and etc. and then resolve each of them. just like "totalCount" which added to schema fields dynamically by HotChocolate itself.

@LordLyng
Copy link

LordLyng commented Apr 16, 2021

I've been following this with great interest as i have a similar problem where I need complex aggregation in my query. I'm still not sure what i want is even possible but after seeing @gojanpaolo 's solution to a specific problem i got hopeful.

I gave it a shot but failed miserably - i know this isn't the place to ask "stupid" questions so I created a Stack Overflow post (https://stackoverflow.com/questions/67120898/need-help-creating-sum-aggregation-operation-in-hotchocolate-filters) would be much appreciated if anyone had the time to give it a look.

@symeus-sgs
Copy link

@PascalSenn , @michaelstaib Hi guys, I appreciate the feature won't be available till Nov+, has there been any progress on the blog post mentioned, to give us the ability to develop an interim solution?

Cheers,

@tobias-tengler tobias-tengler added the Area: Data Issue is related to filtering, sorting, pagination or projections label Jun 26, 2021
@ademchenko
Copy link
Contributor

ademchenko commented Aug 23, 2021

Agree with @symeus-sgs, could you @PascalSenn , @michaelstaib, at least write a blog post, please? That is so typical case needed in almost every project, I think.

@chungonion
Copy link

Hi @ademchenko and @symeus-sgs and @LordLyng. I also came across the issue on aggregation fields, where I would like to count the total entries of a field (not just the numbers of entries returning in pagination). I am not sure whether my scenario is the same as your scenario, but at least I managed to get the aggregated field response that fit into my use case.

I will use the example at graphql-workshop fluent type configurations for explanation.

in Configure method, you may resolve a "custom field" with

descriptor  
      .Field("sessionsCount")  
      .ResolveWith<SpeakerResolvers>(t => t.GetSessionsCountAsync(default!, default!, default!, default))  
      .UseDbContext<ApplicationDbContext>();

and in the SpeakerResolvers class, add the GetSessionsCountAsync method

public async Task<IEnumerable<int>> GetSessionsCountAsync(
    Speaker speaker,
    [ScopedService] ApplicationDbContext dbContext,
    SessionByIdDataLoader sessionById,
    CancellationToken cancellationToken)
{
    int[] sessionIds = await dbContext.Speakers
        .Where(s => s.Id == speaker.Id)
        .Include(s => s.SessionSpeakers)
        .SelectMany(s => s.SessionSpeakers.Select(t => t.SessionId))
        .ToArrayAsync();

    return sessionsIds.Length; //Side note here, you can mess up with anykinds of aggreation you want
}

viola! the aggregated value will now be returned in the graphQL response.

@michaelstaib
Copy link
Member Author

We are introducing a new projection engine with V13 that will open up the engine to do proper aggregations. We are still experimenting but it looks promising.

@ademchenko
Copy link
Contributor

ademchenko commented Sep 17, 2021

@chungonion , the issue is not about including into response some custom field which could be an aggregation field or any other one. The issue is to build the aggregation over the filtered ([UseFiltering]) collection- so, to reuse the actual filter to build the aggregated values. @michaelstaib could you or your teammates, please, explain how to do that in version 12, because version 13 is too far for us.

@PascalSenn
Copy link
Member

@ademchenko

using System;
using System.Collections.Generic;
using System.Linq;
using HotChocolate;
using HotChocolate.Data;
using HotChocolate.Data.Filters.Expressions;
using HotChocolate.Data.Sorting.Expressions;
using HotChocolate.Resolvers;
using HotChocolate.Types;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

namespace Test
{
    public class Startup
    {
        // This method gets called by the runtime. Use this method to add services to the container.
        // For more information on how to configure your application, visit https://go.microsoft.com/fwlink/?LinkID=398940
        public void ConfigureServices(IServiceCollection services)
        {
            services
                .AddRouting()
                .AddGraphQLServer()
                .AddQueryType<Query>()
                .AddTypeExtension<FooConntectionExtensions>()
                .AddFiltering()
                .AddSorting()
                .AddProjections();
        }

        // This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
        public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
        {
            if (env.IsDevelopment())
            {
                app.UseDeveloperExceptionPage();
            }

            app.UseWebSockets();
            app.UseRouting();

            app.UseEndpoints(endpoints =>
            {
                // By default the GraphQL server is mapped to /graphql
                // This route also provides you with our GraphQL IDE. In order to configure the
                // the GraphQL IDE use endpoints.MapGraphQL().WithToolOptions(...).
                endpoints.MapGraphQL();
            });
        }
    }

    [ExtendObjectType(WellKnownTypeNames.FooConnection)]
    public class FooConntectionExtensions
    {
        public int TotalOfNumbers([ScopedState(WellKnownContextData.Data)] IEnumerable<Foo> foo) =>
            foo.Sum(x => x.Number);

        public int MaxOfNumbers([ScopedState(WellKnownContextData.Data)] IEnumerable<Foo> foo) =>
            foo.Max(x => x.Number);

        public int MinOfNumbers([ScopedState(WellKnownContextData.Data)] IEnumerable<Foo> foo) =>
            foo.Min(x => x.Number);
    }

    public static class WellKnownContextData
    {
        public const string Data = nameof(Data);
    }

    public static class WellKnownTypeNames
    {
        public const string FooConnection = nameof(FooConnection);
    }

    public class Query
    {
        [UsePaging(ConnectionName = "Foo")]
        [UseProjection]
        [UseFiltering]
        [UseSorting]
        public IEnumerable<Foo> GetFoos(IResolverContext context)
        {
            IEnumerable<Foo> result = _data.Filter(context).Sort(context);

            context.ScopedContextData =
                context.ScopedContextData.SetItem(WellKnownContextData.Data, result);

            return result;
        }

        private static IEnumerable<Foo> _data = new Foo[]
        {
            new(Guid.NewGuid(), 1, "A"), new(Guid.NewGuid(), 2, "A"),
            new(Guid.NewGuid(), 4, "A"), new(Guid.NewGuid(), 3, "A")
        };
    }

    public record Foo
    {
        public Foo()
        {
        }

        public Foo(Guid id, int number, string name)
        {
            Id = id;
            Name = name;
            Number = number;
        }

        public Guid Id { get; init; }
        public int Number { get; init; }
        public string Name { get; init; }
    }
}

@ademchenko
Copy link
Contributor

@PascalSenn does the same approach work for OffsetPaging?

@MaheshB0ngani
Copy link

@tjolr this feature can be easily done on your own with type extensions or type interceptors.

We have put aggregates towards the end of this year since there is really not a so big need to have this in the sense that it is easy to build this on your own.

However, we will do it and have scheduled it for the November release. No promises, it might be moved to an earlier or later release. We in general look at feedback of the community to determine the need of a feature.

What we could however do is write up a blog post outlining how one could implement this. @PascalSenn what do you think?

Waiting for this feature, or the blog post explaining how to do this on our end.

@michaelstaib
Copy link
Member Author

michaelstaib commented Nov 22, 2021

@MaheshB0ngani look at the code examples pascal posted.

#924 (comment)

@michaelstaib michaelstaib modified the milestones: HC-2021-12, Backlog Nov 22, 2021
@tobias-tengler
Copy link
Collaborator

How you can add aggregation fields to the Connection type is also documented here.

@ademchenko
Copy link
Contributor

@MaheshB0ngani which API for filtering do you use?
The new one or the old one?

@MaheshB0ngani
Copy link

Hi @michaelstaib Thank you for your quick response.

I was expecting a feature like group by on the fields the type and which has to be executed on the Database using Entity Framework and IQueryable.

 public IQueryable<User> GetUsers([Service] DbContext repository)
        => repository.Users; // many other tables are involved here using some joins.

User model may look like

User{
  Name,
  FavouriteFood,
  FavSport,
  FavTeam,
  State,
  Country,
}

I know the example I used might be silly. However, my original requirement is exactly similar.
I need a list with the following criteria (after lot more joins with other tables and some other logic)

SELECT FavTeam, COUNT(*) FROM Users
GROUP BY FavTeam

SELECT Country, FavTeam, COUNT(*) FROM Users
GROUP BY Country, FavTeam


SELECT Country, FavSport, COUNT(*) FROM Users
GROUP BY Country, FavSport


SELECT State, FavSport, COUNT(*) FROM Users
GROUP BY State, FavSport

and lot more, fully customizable from our UI. We need to be as generic as possible. I love the way HotChocolate will do filtering, Sorting and Pagination on top of EF & IQueryable. And also expected HotChocolate would have something out of the box for us to do Aggregations (Dynamic Group by) in very simple manner.

Note: I'm using Entity Framework Core to query the database and expecting the data to be queries on the Database and not in-memory after fetching into .Net application.
Also, there are other models apart of Users, so the the number of grouping combinations are very huge.

Please guide me with an example of how to implement these kind of dynamic grouping things.

@VarunSaiTeja
Copy link
Contributor

Any update on this? When we can expect this?
@PascalSenn

@ChilliCream ChilliCream locked and limited conversation to collaborators Mar 14, 2022
@PascalSenn PascalSenn converted this issue into discussion #4843 Mar 14, 2022

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
Area: Data Issue is related to filtering, sorting, pagination or projections 🎉 enhancement New feature or request 🌶️ hot chocolate
Projects
None yet
Development

No branches or pull requests