Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to filter and limit? #27

Closed
aminghadersohi opened this issue Dec 20, 2016 · 33 comments
Closed

How to filter and limit? #27

aminghadersohi opened this issue Dec 20, 2016 · 33 comments

Comments

@aminghadersohi
Copy link

Hey really nice library. Does it support filtering and limit/offset?

@ornj
Copy link

ornj commented Dec 27, 2016

If you use SQLAlchemyConnectionField it subclasses graphene.relay.connection.ConnectionField which gives you the before, after, first, last keyword arguments. You can use first to limit and after to paginate.

You can extrapolate how this works from https://facebook.github.io/relay/docs/graphql-connections.html#content

@aminghadersohi
Copy link
Author

awesome! thanks. It doesn't look like it's actually passing the params to SQLAlchemy, and rather filter/limiting the results after they are fetched from the db. Looking at graphene_sqlalchemy/fields.py its doing the full query and then calling connection_from_list_slice that then slices it given the params. Is there a particular reason that the query itself is not offset and limited? Or am I misunderstanding. Thank you.

@ornj
Copy link

ornj commented Dec 29, 2016

Is it? Looking at my logs I do see LIMIT %s, %s in the SELECT statement when using first, after. You will see a query to get a count of the number of total records so that it can provide additional paging information.

@aminghadersohi
Copy link
Author

aminghadersohi commented Dec 30, 2016

oh! you are right. I was subclassing SQLAlchemyConnectionField so that I could add the same parameter for all my objects and that was causing the problem.

Here was the culprit code. As you can see I was trying to provide a way to have filter by uuid for all my objects, but obviously its not correct.

class MySQLAlchemyConnectionField(SQLAlchemyConnectionField):
    def __init__(self, type, *args, **kwargs):
        super(
            MySQLAlchemyConnectionField,
            self).__init__(type, uuid=String(), *args, **kwargs)

    @classmethod
    def get_query(cls, model, context, info, args):
        query = super(MySQLAlchemyConnectionField, cls).get_query(model,
                                                                      context,
                                                                      info,
                                                                      args)
        if args:
            for attr, value in args.items():
                query = query.filter(getattr(model, attr).like(value))
        return query

@aminghadersohi
Copy link
Author

Any suggestions on how I can query my objects by uuid?

@aminghadersohi
Copy link
Author

Right now I am thinking that I have to add my own resolve function to the query. Is that right?

@aminghadersohi
Copy link
Author

aminghadersohi commented Dec 30, 2016

I was able to do it this way:

Base = sqlalchemy.ext.declarative.declarative_base()

class ClientRoleStore(Base):
    __tablename__ = "clientrole"
    uuid = Column(CHAR(36), primary_key=True, nullable=False, default=lambda: str(uuid1()))

class ClientRole(SQLAlchemyObjectType):
    class Meta:
        model = ClientRoleStore
        interfaces = (relay.Node,)

all_types = [
    ClientRole,
    # there are other fields but I have kept it simple here
]

query_attrs = {'node': relay.Node.Field()}

def resolve(model, self, args, context, info):
    session = get_session(context)
    query = session.query(model)
    if args and 'uuid' in args:
        query = query.filter(getattr(model, 'uuid') == args['uuid'])
    return query

for type_cls in all_types:
    model = type_cls.__dict__['_meta'].model
    name = model.__tablename__
    query_attrs[name + 's'] = SQLAlchemyConnectionField(type_cls, uuid=String())
    query_attrs[name] = graphene.Field(type_cls)
    query_attrs['resolve_' + name + 's'] = partial(resolve, model)

Query = type("Query", (graphene.ObjectType,), query_attrs)

Now I can do this:

{
  clientroles (uuid: "060be6a1-ccc9-11e6-bcd7-f45c89ca5a3d") {
    edges {
      node {
        uuid
      }
    }
  } 
}

that returns:

{
  "data": {
    "clientroles": {
      "edges": [
        {
          "node": {
            "uuid": "060be6a1-ccc9-11e6-bcd7-f45c89ca5a3d"
          }
        }
      ]
    }
  }
}

And do this:

{
  clientroles (first:2) {
    edges {
      node {
        uuid
      }
    }
  } 
}

that returns

{
  "data": {
    "clientroles": {
      "edges": [
        {
          "node": {
            "uuid": "060be6a1-ccc9-11e6-bcd7-f45c89ca5a3d"
          }
        },
        {
          "node": {
            "uuid": "0adffea8-cd6b-11e6-970e-f45c89ca5a3d"
          }
        }
      ]
    }
  }
}

Also I think I can extend SQLAlchemyConnectionField with a metaclass approach to provide filtering on any field! This is awesome!! Please let me know if this is a good approach.

@dspoja
Copy link

dspoja commented Jan 17, 2017

I am new to Graphene and GraphQL, and have been trying to figure out how to query by a single argument with graphene-sqlalchemy. Is the above solution the only way to query by argument? Does one have to use Relay with Graphene? I would prefer not to use Relay.
Thanks!

Update: I figured it out. I had to add an argument to the object that is being returned by the query and that way the resolver would be able to pick it up:
my_objects = graphene.List(MyObject, userId=graphene.Int())

def resolve_my_objects_by_user(self, args, context, info):
query =MyObject.get_query(context)
print(str(args))
user_id = args.get('userId')
print(user_id)
return query.filter(MyObjectModel.user_id == user_id)

@bochuxt
Copy link

bochuxt commented May 19, 2017

dosInCoworkerbyId = graphene.List(of_type=DoInCoworker,coworker_id=graphene.Int(),maestro_id=graphene.Int())#SQLAlchemyConnectionField(DoInCoworker)
def resolve_dosInCoworkerbyId(self,args,context,info):
    querydosInCoworker = DoInCoworker.get_query(context)
    if args=={}:
        return querydosInCoworker.all()
    elif args.get('coworker_id') != None and args.get('maestro_id') == None:
        # print(" +++ ",args.get('creator_id'),args.get('registered'))
        return  querydosInCoworker.filter_by(hubot_id=args.get('coworker_id')).all()
    else:
        return  querydosInCoworker.filter_by(maestro_id=args.get('maestro_id')).all()

@athal7
Copy link

athal7 commented Jul 20, 2017

Here's something I threw together that still supports the default relay params, and also allows filtering by any model attribute, including association_proxy:

class MyConnectionField(graphene_sqlalchemy.SQLAlchemyConnectionField):
    RELAY_ARGS = ['first', 'last', 'before', 'after']

    @classmethod
    def get_query(cls, model, context, info, args):
        query = super(MyConnectionField, cls).get_query(model, context, info, args)
        for field, value in args.items():
            if field not in cls.RELAY_ARGS:
                query = query.filter(getattr(model, field) == value)
        return query

class Query(graphene.ObjectType):
    node = graphene.relay.Node.Field()
    all_somethings = MyConnectionField(SomethingNode, field_name=grahene.String())   

@lnunno
Copy link

lnunno commented Aug 15, 2017

@aminghadersohi Pretty sure your example is what I need, but it's difficult to wrap my head around with only a beginning knowledge of GraphQL and the Python supporting libraries!

Honestly, I think this sort of functionality should be built into graphene-sqlalchemy or at least be a configuration option when instantiating the Metaclass. As it stands this library is really only useful for non-parameterized queries.

@kavink
Copy link

kavink commented Nov 3, 2017

@athal7 Looks like this does not work in latest releases with Graphene 2.0, downgrading it works. Any thoughts on how to fix it? Looking at upgrading to 2.0 docs nothing stood out.

@athal7
Copy link

athal7 commented Nov 3, 2017

@kavink yes just a few changes:

class MyConnectionField(graphene_sqlalchemy.SQLAlchemyConnectionField):
    RELAY_ARGS = ['first', 'last', 'before', 'after']

    @classmethod
-   def get_query(cls, model, context, info, args):
-       query = super(MyConnectionField, cls).get_query(model, context, info, args)
+   def get_query(cls, model, info, **args):
+       query = super(MyConnectionField, cls).get_query(model, info, **args)
        for field, value in args.items():
            if field not in cls.RELAY_ARGS:
                query = query.filter(getattr(model, field) == value)
        return query

class Query(graphene.ObjectType):
    node = graphene.relay.Node.Field()
    all_somethings = MyConnectionField(SomethingNode, field_name=grahene.String()) 

@gonardfreeman
Copy link

Here if someone needs exact and like query filtering:
`
CUSTOM_ARGS = ['contain', 'exact']
@classmethod
def get_query(cls, model, info, **args):
query = super(CustomConnectionField, cls).get_query(model, info, **args)
for field, value in args.items():
if field not in cls.RELAY_ARGS:
for arg in cls.CUSTOM_ARGS:
if arg in field:
query = query.filter(
getattr(model, field.replace('_' + arg, '')).like('%{}%'.format(value))
)
return query

            query = query.filter(
                getattr(model, field) == value
            )
    return query`

And in Query class:
all_branches = CustomConnectionField( BranchCAT, ID=String(), branch_name_ua=String(), branch_name_ua_contain=String(), address_ua_contain=String() )

@kavink
Copy link

kavink commented Nov 6, 2017

@athal7 Thanks a lot I did do that, But it did not work for me, Mainly i think because i have resolve_ out of the class . posted by question as new one on #85 Any thoughts on it ?

@teruokun
Copy link

So some of this works, but doesn't expose the right schema. Here was my solution:

class InstrumentedQuery(SQLAlchemyConnectionField):
    def __init__(self, type, *pos_args, **kwargs):
        self.query_args = {}
        kwargs['sort_by'] = String()

        for k, v in type._meta.fields.items():
            if isinstance(v, Field):
                field_type = v.type
                if isinstance(field_type, NonNull):
                    field_type = field_type.of_type
                self.query_args[k] = field_type()
        print(self.query_args)
        super().__init__(type, *pos_args, **kwargs, **self.query_args)

    def get_query(self, model, info, **args):
        query_filters = {k: v for k, v in args.items() if k in self.query_args}
        query = model.query.filter_by(**query_filters)
        if 'sort_by' in args:
            query.order_by(getattr(model, to_snake_case(args['sort_by'])))
        return query

    def connection_resolver(self, resolver, connection, model, root, info, **args):
        query = resolver(root, info, **args) or self.get_query(model, info, **args)
        count = query.count()
        connection = connection_from_list_slice(
            query,
            args,
            slice_start=0,
            list_length=count,
            list_slice_length=count,
            connection_type=connection,
            pageinfo_type=PageInfo,
            edge_type=connection.Edge,
        )
        connection.iterable = query
        connection.length = count
        return connection

@teruokun
Copy link

Sorry, after testing I realized my solution was a little more incomplete than I intended and didn't preserve certain features. Here's a tested version.

Note: sort_by should likely be a list of objects with field and direction keys where field is an enum generated from the field names and direction is an enum of 'asc', 'desc', but I didn't have time to fully put that in

from graphene_sqlalchemy import SQLAlchemyConnectionField
from graphene import Field, String, NonNull, List
from graphene.utils.str_converters import to_snake_case
from graphene.relay.connection import PageInfo
from graphql_relay.connection.arrayconnection import connection_from_list_slice
from sqlalchemy import desc, asc


ORDER_FUNCTIONS = {'asc': asc, 'desc': desc}


class InstrumentedQuery(SQLAlchemyConnectionField):
    def __init__(self, type, **kwargs):
        self.query_args = {}
        for k, v in type._meta.fields.items():
            if isinstance(v, Field):
                field_type = v.type
                if isinstance(field_type, NonNull):
                    field_type = field_type.of_type
                self.query_args[k] = field_type()
        args = kwargs.pop('args', dict())
        args.update(self.query_args)
        args['sort_by'] = List(String, required=False)
        super().__init__(type, args=args, **kwargs)

    def get_query(self, model, info, **args):
        query_filters = {k: v for k, v in args.items() if k in self.query_args}
        query = model.query.filter_by(**query_filters)
        if 'sort_by' in args:
            criteria = [self.get_order_by_criterion(model, *arg.split(' ')) for arg in args['sort_by']]
            query = query.order_by(*criteria)
        return query

    def connection_resolver(self, resolver, connection, model, root, info, **args):
        query = resolver(root, info, **args) or self.get_query(model, info, **args)
        count = query.count()
        connection = connection_from_list_slice(
            query,
            args,
            slice_start=0,
            list_length=count,
            list_slice_length=count,
            connection_type=connection,
            pageinfo_type=PageInfo,
            edge_type=connection.Edge,
        )
        connection.iterable = query
        connection.length = count
        return connection

    @staticmethod
    def get_order_by_criterion(model, name, direction='asc'):
        return ORDER_FUNCTIONS[direction.lower()](getattr(model, to_snake_case(name)))

@kavink
Copy link

kavink commented Feb 5, 2018

@teruokun seems like your solution only does sort_by but looks like it does not allow someone to filter tables based on any columns which this issue seems to be about ?

Also is it possible to also add usage of your class and an example of what user should expect when using ?

@teruokun
Copy link

teruokun commented Feb 5, 2018

Actually it does work for column-based data by analyzing the fields in the meta model and not only allows you to send them but also enforces correct typing (based on the output field types) as well as inclusion in the schema as arguments for the related query. It doesn't propagate to connections referenced through relationships, which is ideally what I'd want, but I've verified and tested that it does actually get evidenced in the schema properly. Basically, my solution works by creating a query that works for all fields, not just the ones you would specify as specific arguments (which may be too much magic, I grant you).

Usage is the same way you'd use SQLAlchemyConnectionField to specify a query:

class Query(graphene.ObjectType):
    node = graphene.relay.Node.Field()
    all_somethings = InstrumentedQuery(SomethingNode) 

@kavink
Copy link

kavink commented Feb 19, 2018

@teruokun Thanks , Your work has been very helpful to me. One other issue i'm trying to resolve is .

class SomethingElseNode(db.Model):
   __tablename__ = 'something_else'
   id = Column(Integer, primary_key=True)
   ....
   something_id = Column(Integer, ForeignKey('something id'), nullable=False)

class SomethingNode(db.Model):
   __tablename__ = 'something'
   id = Column(Integer, primary_key=True)
   ....
   all_something_else = relationship(SomethingElseNode, backref="something", cascade="all,delete")



class Query(graphene.ObjectType):
    node = graphene.relay.Node.Field()
    something = graphene.relay.Node.Field(SomethingNode)
    all_somethings = InstrumentedQuery(SomethingNode) 
    something_else = graphene.relay.Node.Field(SomethingElseNode)
    all_something_else = InstrumentedQuery(SomethingElseNode) 

For SomethingNode has a relationship with SomethingElseNode , I am able to query any fields of SomethingNode and SomethingElseNode from all_* in addition to first, last, before, sort_by , etc..

But when going from all_somethings to all_something_else, I can only filter by SQLAlchemyConnectionField filters i.e. first, last, before, sort_by , etc. , but not buy columns of SomethingElseNode

Any thoughts on how can make InstrumentedQuery available for relationships too ? So I can filter by columns of SomethingElseNode

@teruokun
Copy link

teruokun commented Mar 6, 2018

Yeah, I was looking into that myself and couldn't figure out exactly how to get it to work. I think I'd need some additional help troubleshooting from one of the core developers because I was trying to register it as the default connection field factory using the graphene_sqlalchemy.fields.registerConnectionFieldFactory method and couldn't get it to work correctly, though that may be because of odd interactions with Relay plugins (as most of my types need the Node interface as well).

@jrazmi
Copy link

jrazmi commented Mar 29, 2018

NOTE: not fully tested at all. I'll update the code here if i find anything terribly wrong with it.
I built out a somewhat dynamic version of this for filtering if anyone is interested, or has better ideas/edits let me know.

Essentially it allows you add a JSON field type argument "filters" to a custom connection field.
Filters takes a list. For each item in the filter list you can pass it an dict with the key,op,val (AND query), or you can pass it a list of dicts with key,op,val(OR query).

example OR gql query:

query {
  users(filters:[[{key: "first_name", op:"starts", val:"Josh"},{key: "first_name", op:"==", val:"Fake"}]], first: 10) {
    edges {
      node {
        id
        firstName
        email
        lastName
      }
    }
    pageInfo {
      startCursor
      endCursor
    }
  }
}

Dynamic Filtering Connection Field:

from graphene_sqlalchemy import SQLAlchemyConnectionField
from sqlalchemy.sql import and_, func, or_, not_
class CustomConnectionField(SQLAlchemyConnectionField):
    RELAY_ARGS = ['first', 'last', 'before', 'after']
    @classmethod
    def get_query(cls, model, info, **args):
        query = super(CustomConnectionField, cls).get_query(model, info, **args)
        if args.get("filters"):
            query = filterQuery(query, model, args["filters"])
        return query

def filterQuery(query, model, filters):
    for filter in filters:
        conditions = []
        if isinstance(filter, (list,)):
            for filt in filter:
                conditions = constructConditions(conditions,filt, model)
            condition = or_(*conditions)
            query = query.filter(condition)
        if isinstance(filter, (dict,)):
            conditions = constructConditions(conditions, filter, model)
            query = query.filter(*conditions)
    return query
def constructConditions(conditions, filter, model):
    c = getattr(model, filter.get("key"))
    v = filter.get('val')
    op = filter.get("op")
    if not c or not op or not v:
        pass
    if op == '==':
        conditions.append(c == v)
    if op == '!=':
        conditions.append(c != v)
    if op == "<=":
        conditions.append(c <= v)
    if op == ">=":
        conditions.append(c >= v)
    if op == ">":
        conditions.append(c > v)
    if op == "<":
        conditions.append(c < v)
    if op == 'starts':
        conditions.append(c.ilike(v+"%"))
    if op == 'ends':
        conditions.append(c.ilike("%"+v))
    if op == 'contains':
        conditions.append(c.contains(v))
    if op == 'in':
        conditions.append(c.in_(v))
    if op == 'notin':
        conditions.append(not_(c.in_(v)))
    return conditions

Example query/resolver:

class User(SQLAlchemyObjectType):
    class Meta:
        model = UserModel
        interfaces = (graphene.relay.Node,)
class Query(graphene.ObjectType):
    users = CustomConnectionField(User, filters=GenericScalar())



@kavink
Copy link

kavink commented Apr 4, 2018

Has anyone being able to create grapqhl schema for JSONType field ? i.e. my Table has a JSON column. Im using Postgres, So wondering if anyone got graphql to Postgres json queries working ? Any pointers ?

@kannansamp
Copy link

kannansamp commented Apr 8, 2018

How can I get the original ID value in the response instead of base64 encoded value?

{
"node": {
"id": "TWljcm9wb3N0czox"
}
}

Expecting value:

{
"node": {
"id": "3045"
}
}

@dafrie
Copy link

dafrie commented Sep 8, 2018

@interpegasus This is the graphene-sqlalchemy repository and not the graphene-django repo, so unfortunately not applicable...

@comtihon
Copy link

comtihon commented Sep 8, 2018

Hi all,
I've added PR #164 for filtering on scalar types.

@nav1885
Copy link

nav1885 commented Jun 2, 2019

I was able to do it this way:

Base = sqlalchemy.ext.declarative.declarative_base()

class ClientRoleStore(Base):
    __tablename__ = "clientrole"
    uuid = Column(CHAR(36), primary_key=True, nullable=False, default=lambda: str(uuid1()))

class ClientRole(SQLAlchemyObjectType):
    class Meta:
        model = ClientRoleStore
        interfaces = (relay.Node,)

all_types = [
    ClientRole,
    # there are other fields but I have kept it simple here
]

query_attrs = {'node': relay.Node.Field()}

def resolve(model, self, args, context, info):
    session = get_session(context)
    query = session.query(model)
    if args and 'uuid' in args:
        query = query.filter(getattr(model, 'uuid') == args['uuid'])
    return query

for type_cls in all_types:
    model = type_cls.__dict__['_meta'].model
    name = model.__tablename__
    query_attrs[name + 's'] = SQLAlchemyConnectionField(type_cls, uuid=String())
    query_attrs[name] = graphene.Field(type_cls)
    query_attrs['resolve_' + name + 's'] = partial(resolve, model)

Query = type("Query", (graphene.ObjectType,), query_attrs)

Now I can do this:

{
  clientroles (uuid: "060be6a1-ccc9-11e6-bcd7-f45c89ca5a3d") {
    edges {
      node {
        uuid
      }
    }
  } 
}

that returns:

{
  "data": {
    "clientroles": {
      "edges": [
        {
          "node": {
            "uuid": "060be6a1-ccc9-11e6-bcd7-f45c89ca5a3d"
          }
        }
      ]
    }
  }
}

And do this:

{
  clientroles (first:2) {
    edges {
      node {
        uuid
      }
    }
  } 
}

that returns

{
  "data": {
    "clientroles": {
      "edges": [
        {
          "node": {
            "uuid": "060be6a1-ccc9-11e6-bcd7-f45c89ca5a3d"
          }
        },
        {
          "node": {
            "uuid": "0adffea8-cd6b-11e6-970e-f45c89ca5a3d"
          }
        }
      ]
    }
  }
}

Also I think I can extend SQLAlchemyConnectionField with a metaclass approach to provide filtering on any field! This is awesome!! Please let me know if this is a good approach.

This works great for smaller datasets, but doesnt seem to scale well for large datasets (1M+rows)
From what i see, query is being executed across the entire db and then filter/limit is happening server side. Any pointers if im missing something obvious ?

What I'm trying to do is to expose a few tables via graphql so that its easier for end users to query by passing filters and and use pagination to handle resultset efficiently for large tables.

class LaunchDateModel(Base):
    __tablename__ = 'launch_date'
    __table_args__ = {'schema': 'some_schema'}

    id = Column(SmallInteger, primary_key=True)
    date = Column(Date)
    name = Column(String(10))


class LaunchDate(SQLAlchemyObjectType):
    class Meta:
        model = LaunchDateModel
        interfaces = (graphene.relay.Node,)

class LaunchDateConnection(graphene.relay.Connection):
    class Meta:
        node = LaunchDate
    count = graphene.Int()

    def resolve_count(self, info):
        return len(self.edges)

class LaunchDateQueries(graphene.ObjectType):

    class Meta:
        description = "Launch date queries"

    launch_dates = SQLAlchemyConnectionField(
        LaunchDate,
        first=graphene.Int(required=True),
        id=graphene.Int(),
        date=graphene.Date(),
        name=graphene.String()
     )

    def resolve_launch_dates(self, info, **kwargs):
        query = LaunchDate.get_query(info)
        for attr, value in kwargs.items():
            if hasattr(LaunchDateModel, attr):
                query = query.filter(getattr(LaunchDateModel, attr) == value)

        logger.info("Query: {}".format(query))
        return query.all()

@vaskokj
Copy link

vaskokj commented Jun 19, 2019

Is this still not built into the graphene-sqlalchemy library?

This comments solution is what I was able to get to work, but wasn't sure if there was a more appropriate/recommended solution that might be built in that I might be missing.

#27 (comment)

@art1415926535
Copy link

If someone like me is tired of waiting for filtering, then you can use graphene-sqlalchemy-filter.

@cglacet
Copy link

cglacet commented Feb 2, 2021

Is there a reason for not having an offset? Is using offset instead of after considered bad practice? (I'm asking because I see offsets being used everywhere in GraphQL examples).

@smentek
Copy link

smentek commented Mar 7, 2021

Is there a reason for not having an offset? Is using offset instead of after considered bad practice? (I'm asking because I see offsets being used everywhere in GraphQL examples).

As I understand the idea is to follow relay specification that has advantage of opaque cursors, where when using limit/offset you end up jumping over the data or having them twice in the successive responses.

@erikwrede
Copy link
Member

Closing all [duplicate] issues related to this. The discussion will continue in #347 (WIP). A proposal for the implementation of filters is currently being worked on and will be posted there once it is ready.

@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related topics referencing this issue.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests