Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protection against malicious queries #907

Open
Arfey opened this issue Feb 21, 2019 · 19 comments
Open

Protection against malicious queries #907

Arfey opened this issue Feb 21, 2019 · 19 comments

Comments

@Arfey
Copy link

@Arfey Arfey commented Feb 21, 2019

Hey. I was looking for a lot of information on how to protect against malicious requests, and as a result I found several common approaches:

All this approaches we can implement by meddleware and custom backend. But it will be cool if this solution is out of the box.

Also, graphene don't have information about security into the documentation.

ps: I can try to help, but if u have no reason why this is a bad idea.

@jmichalicek

This comment has been minimized.

Copy link

@jmichalicek jmichalicek commented Mar 7, 2019

I have recently been looking into exactly these same things. It would be great to have something built in available, and if not, at least some examples of how someone might best implement these with graphene.

@jkimbo

This comment has been minimized.

Copy link
Member

@jkimbo jkimbo commented Mar 16, 2019

@Arfey this is an area where there aren't any official answers in Graphene at the moment but I'll give you my opinion on the approaches that you've listed:

1 and 2: query cost or resource limitations + limiting query depth

In another issue I've written up some sample code (not thoroughly tested) that will implement a basic max query depth check: #772 (comment) Something similar could be used for query cost calculations but it's unclear to mean how you would determine the cost of particular fields.

I think there is opportunity for experimentation in user space for this.

3. query whitelisting

The backend functionality is also the place to implement any kind of persisted/query whitelisting. I actually don't think this feature is within the scope of the Graphene project though because it's going to be tightly coupled to the rest of your application. You would need to define where the persisted queries are stored, how the front end part of your app adds new queries, how your server interprets static queries etc. Again this is a place where experimentation can happen in userland and Graphene already exposes the right hooks through backends to implement it.

4. hide introspection for production mode

This sounds like a reasonable suggestion and I think it could be implemented. Any thoughts @graphql-python/core @graphql-python/governors ?

@jkimbo jkimbo changed the title The native security api from malicious queries Protection against malicious queries Mar 16, 2019
@mvanlonden

This comment has been minimized.

Copy link
Member

@mvanlonden mvanlonden commented Mar 19, 2019

@jkimbo @Arfey Solution 4 sounds good. Easy to implement and will address this vulnerability in a timely manner. I don't see a reason why we would need introspection in production. @Arfey mind submitting a PR with this approach?

@Arfey

This comment has been minimized.

Copy link
Author

@Arfey Arfey commented Mar 19, 2019

yes, i will make pull request some time later

@etandel

This comment has been minimized.

Copy link
Contributor

@etandel etandel commented Mar 22, 2019

Just my completely unasked for $.02:

About 3, I think it would help if the types that are automatically generated (from SQLAlchemy or Django ORM models) had their fields explicitly included instead of explicitly excluded. Much like how Django's own ModeField works.

Instead of

class MyType(SQLAlchemyObjectType):
    class Meta:
        model = MyModel
        exclude_fields = ('scary_field',)

this:

class MyType(SQLAlchemyObjectType):
    class Meta:
        model = MyModel
        include_fields = ('safe_field',)

while making the absence of both include_fields or exclude_fields an error.

@ktosiek

This comment has been minimized.

Copy link

@ktosiek ktosiek commented Apr 17, 2019

Hiding introspection won't help much if the attacker has access to a copy of a client application, as she would be able to guess enough of the schema to write custom queries - just one cycle in the graph is enough for an attack.
But it would be useful for hiding upcoming features from competitors :-)

@Arfey

This comment has been minimized.

Copy link
Author

@Arfey Arfey commented Apr 17, 2019

I use middleware for that right now

class HideIntrospectMiddleware:
    """
    This middleware should use for production mode. This class hide the
    introspection.
    """
    def resolve(self, next, root, info, **args):
        if info.field_name == '__schema':
            return None
        return next(root, info, **args)
@sandwichsudo

This comment has been minimized.

Copy link

@sandwichsudo sandwichsudo commented Jun 4, 2019

@Arfey this is an area where there aren't any official answers in Graphene at the moment but I'll give you my opinion on the approaches that you've listed:

1 and 2: query cost or resource limitations + limiting query depth

In another issue I've written up some sample code (not thoroughly tested) that will implement a basic max query depth check: #772 (comment) Something similar could be used for query cost calculations but it's unclear to mean how you would determine the cost of particular fields.

I think there is opportunity for experimentation in user space for this.

3. query whitelisting

The backend functionality is also the place to implement any kind of persisted/query whitelisting. I actually don't think this feature is within the scope of the Graphene project though because it's going to be tightly coupled to the rest of your application. You would need to define where the persisted queries are stored, how the front end part of your app adds new queries, how your server interprets static queries etc. Again this is a place where experimentation can happen in userland and Graphene already exposes the right hooks through backends to implement it.

4. hide introspection for production mode

This sounds like a reasonable suggestion and I think it could be implemented. Any thoughts @graphql-python/core @graphql-python/governors ?

Hiya, thanks for the sample code but how can I plug this into graphene please? Is it a setting I can add, something like:

GRAPHENE = {
    "SCHEMA": "path.to.my.schema",
    "BACKENDS": ["path.to.my.backends.DepthAnalysisBackend"],
}

?

I've also tried adding it to my GraphQL view, which gives me an error when I run a query through graphiql (using the core backend)

graphql_views.GraphQLView.as_view(graphiql=True, backend=GraphQLCoreBackend)

the error is:

document_from_string() missing 1 required positional argument: 'document_string'

Update - sorted it out, needed to provide an instance not the class. I'll leave this here in case it helps someone.

so to add the backend, I've done:

from .graphql import backends
....
urlpatterns = [
    path(
          "v1/graphql",
            graphql_views.GraphQLView.as_view(
                graphiql=True, backend=backends.DepthAnalysisBackend()
            )
     )
]
```
@thejcannon

This comment has been minimized.

Copy link

@thejcannon thejcannon commented Jun 5, 2019

My $.02 for how I handle this:

I have different types in the Schema with different related-node-information based on where the type is being used. This artificially sets a query depth which is dependent on the query.

E.g. If authors have one more books:

  • Author has field books of type AuthorsBook
  • AuthorsBook doesn't have field author, as that would create a cycle

This works for small to medium sized applications and doesn't take much thought/code to get working.

@sandwichsudo

This comment has been minimized.

Copy link

@sandwichsudo sandwichsudo commented Jun 6, 2019

Thanks @thejcannon - agreed, it makes a lot of sense just design the api not to have cycles in it. Still nice to know no one is able to run queries that join many distinct tables on my server.

@jkimbo - can I check something on the code sample you provided please:

...
# We are only interested in queries
            if definition.operation != 'query':
                continue
...

This suggests we're only interested in queries, but since a mutation could return an object (like an authentication mutation might return a user object) does it not make sense to run this on both? Wondering if there is some subtlety I'm missing.
Thanks

@jkimbo

This comment has been minimized.

Copy link
Member

@jkimbo jkimbo commented Jun 6, 2019

@sandwichsudo that is a good point, I can't remember why I added that line but I think you could probably just omit it and it would still work.

@stale

This comment has been minimized.

Copy link

@stale stale bot commented Aug 5, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Aug 5, 2019
@sboisson

This comment has been minimized.

Copy link

@sboisson sboisson commented Aug 16, 2019

I am using this validation rule to disable schema introspection:

class NoIntrospection(ValidationRule):
    def enter_Field(self, node, key, parent, path, ancestors):
        field_name = node.name.value
        if field_name == "__schema" or field_name == "__type":
            self.context.report_error(
                GraphQLError(u"GraphQL introspection is not allowed", [node])
            )
@stale stale bot removed the wontfix label Aug 16, 2019
@FWirtz

This comment has been minimized.

Copy link

@FWirtz FWirtz commented Sep 6, 2019

@sboisson I found this snippet:

export const introspectionTypes = Object.freeze([
  __Schema,
  __Directive,
  __DirectiveLocation,
  __Type,
  __Field,
  __InputValue,
  __EnumValue,
  __TypeKind,
]);

Seems like there is more introspection keywords - I think you should be checking for those too.
I think it might be easier to just check for startsWith("__").


On a more general note: More security tools for this would be nice.
Either included or as attachable middleware.

Adding to the initial list:

  • Input Validation (e.g. is the string I'm getting actually a string? Also, sanitise it.).
@sboisson

This comment has been minimized.

Copy link

@sboisson sboisson commented Sep 10, 2019

@FWirtz Nice find!
But checking just for startsWith("__") might be too much.
It seems that__typename is used a lot, by Apollo client for example.

@FWirtz

This comment has been minimized.

Copy link

@FWirtz FWirtz commented Sep 13, 2019

@sboisson yeah I found that catch afterwards as well haha.
I think it would be best to rather check the query name (i.e. root node) whether it starts with __ instead of the fields. Afaik __typename is always just a nested field. Not entirely sure how to do this though.

@cglacet

This comment has been minimized.

Copy link

@cglacet cglacet commented Dec 3, 2019

Any update on this matter? I found this repo of someone trying to work on this, for now it implements the least interesting part of it (limiting query depth). I'm really new to this so I don't know if this could help.

By the way, is it possible to add a middleware directly to a schema instead of having it plugged to schema.execute, I ask this because in some cases (for example when using Graphene with Starlette), we don't have access to schema.execute as it is performed internally. I opened an issue on that question on Starlette side, not sure what's the best solution here.

@melvinkcx

This comment has been minimized.

Copy link

@melvinkcx melvinkcx commented Dec 23, 2019

Can anyone enlighten me why schema introspection is a flaw?

I use graphene-django, and I only expose non-private fields using include_fields. For us, our React frontend dynamically generates Forms using schema introspection.

@ktosiek

This comment has been minimized.

Copy link

@ktosiek ktosiek commented Dec 23, 2019

@melvinkcx it's not strictly a flaw, it's a risk:

  1. for security, by exposing fields not used by your client applications, fields that might be less tested or even might only be exposed by accident;
  2. for business, by exposing new fields (often with some documentation) before the feature that uses them is released (or even finished).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.