Federation gateway authentication and authorization #343

Closed
adamovittorio opened this issue Nov 4, 2020 · 27 comments

@adamovittorio

adamovittorio commented Nov 4, 2020

I would like to handle authentication and authorization per GraphQL operation at the Gateway level.

This is quite a complex feature, and I don't have a clear solution in mind; I would like to start the discussion for a possible implementation method.

General considerations

  • Pluggable RBAC method
  • The services are responsible for defining the RBAC at schema level to inform the Gateway
# Directives
directive @auth(
  requires: Role = USER,
) on OBJECT | FIELD_DEFINITION

enum Role {
  ADMIN
  USER
  UNKNOWN
}

# Service 1
extend Query {
  me: User @auth(requires: USER)
  userById(id: ID!): User @auth(requires: ADMIN)
}

type User @key(fields: "id") {
  id: ID!
  name: String!
}

# Service 2
type Post @key(fields: "id") {
  id: ID!
  title: String!
  secretValue: String! @auth(requires: ADMIN)
}

extend type User @key(fields: "id") {
  id: ID! @external
  posts: [Post]
}

Prerequisites

  • Implement authentication and authorization directives
  • Decorate service schemas with the authentication directives

Flow

  • The user sends a GraphQL operation with credentials
  • The Gateway verifies the user credentials, parses the user's RBAC policy, and adds both to the context
  • The Gateway parses the operation document and compares it with the schema definitions; the result of this step is the RBAC policy that the operation requires, which can then be compared with the user's
  • The Gateway compares the policies and, in case of successful authorization, delegates the GraphQL execution to the underlying service
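
A minimal sketch of the comparison step in the flow above, in plain JavaScript with no Mercurius dependency. The role ranking (ADMIN implies USER), the `requiredRoleByField` map, and both function names are assumptions made for illustration; in practice the map would be derived from the `@auth` directives in the service schemas.

```javascript
// Hypothetical role ranking: a higher number implies more privileges.
const ROLE_RANK = { UNKNOWN: 0, USER: 1, ADMIN: 2 };

// Hypothetical policy, as it might be extracted from the @auth directives.
const requiredRoleByField = {
  'Query.me': 'USER',
  'Query.userById': 'ADMIN',
  'Post.secretValue': 'ADMIN',
};

// Returns the requested fields the user is not allowed to access.
function findDeniedFields(requestedFields, userRole) {
  // Roles that are not in the ranking are treated as UNKNOWN.
  const userRank = ROLE_RANK[userRole] ?? ROLE_RANK.UNKNOWN;
  return requestedFields.filter((field) => {
    const required = requiredRoleByField[field];
    // Fields without an @auth directive are treated as public here.
    return required !== undefined && userRank < ROLE_RANK[required];
  });
}

// The gateway would delegate to the services only when nothing is denied.
function isAuthorized(requestedFields, userRole) {
  return findDeniedFields(requestedFields, userRole).length === 0;
}
```

For example, `findDeniedFields(['Query.me', 'Post.secretValue'], 'USER')` reports only `Post.secretValue`, so the gateway could either reject the whole operation or strip that field.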
@mcollina
Collaborator

mcollina commented Nov 4, 2020

I've been doing some research on the topic. Most of this revolves around https://www.openpolicyagent.org/ and https://casbin.org/.

If you'd like to work on this, I'll be happy to create a new repo in the org... I think this one has way too many features already.

I think we might need to define how the auth token is modelled - or maybe you are already assuming that we'd use JWT. In that case, we developed https://github.com/nearform/fast-jwt for a very similar purpose - it's a similar technique to the one used in the AWS API Gateway.

@adamovittorio
Author

adamovittorio commented Nov 4, 2020

I like the idea of having different auth strategies that the user can choose.

We could start with the most common auth method to verify the user token and decorate the request context with the credentials, export the GraphQL directives to use in the schema definition, and, as you suggested, use something like Open Policy Agent or Casbin to define and check the RBAC.

All of that could live in the new auth repo, but I still haven't properly understood how or where to compare the user's RBAC with the requested operation.

I assume that we have to do it in this repo somewhere in the gateway resolvers creation or we need to have a hook system in place from where we can extend the gateway "query planner".

@mcollina
Collaborator

mcollina commented Nov 7, 2020

I think you'll need to add some hooks to plug into some of the internals to achieve this. That's fine, and we should definitely add them when they are needed.

@hsluoyz

hsluoyz commented Nov 7, 2020

Hi, I'm from the Casbin team :) I noticed that Nearform has just released two plugins for Casbin-based authorization. Does that help with our GraphQL scenario?

@adamovittorio
Author

@mcollina should we start defining the events to emit in Mercurius? You mentioned that the hooks logic is really similar to the work already done for Fastify, and that we can reuse a lot from there.

@hsluoyz I think this is exactly what we should do here too, adapting it to the GraphQL operation instead of to a route-based service.

cc @projectjudge @aleccool213

@mcollina
Collaborator

mcollina commented Nov 7, 2020

That's a good approach!

@jonnydgreen jonnydgreen mentioned this issue Jan 21, 2021
6 tasks
@jonnydgreen
Contributor

jonnydgreen commented Mar 23, 2021

Hi! I've been starting to look at this in detail based on @adamovittorio 's suggestion and have come up with an implementation suggestion for it. Let me know what you think!

The general approach

Register a plugin after Mercurius is registered, passing the user-defined checks and directives in its options:

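const Fastify = require('fastify');
const mercurius = require('mercurius');

const schema = `
  type Query {
    add(x: Int, y: Int): Int
  }
`;
const resolvers = {
  Query: {
    add: async (_, { x, y }) => x + y,
  },
};
const app = Fastify();
app.register(mercurius, { schema, resolvers });
// `mercuriusAuthPlugin` and `options` stand for the proposed plugin and its configuration
app.register(mercuriusAuthPlugin, options);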

Utilise the new Mercurius hooks to run the authn/authz checks:

  • authentication - preExecution
  • authorization - preGatewayExecution

The Directive definition will be provided in the plugin configuration together with the auth strategy, which should allow total flexibility. The plugin will use this Directive definition to look for identifiers in the service schema.

We can add a config option for Mercurius to load this plugin and define a custom auth strategy with associated SDL.

[Image: Screenshot 2021-03-23 at 12 21 57]

Add Mercurius plugin

Using the fastify plugin framework, register the plugin as normal (but check that mercurius is enabled beforehand). Something along the lines of:

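const fp = require('fastify-plugin');

module.exports = fp(async (fastify, options) => {
  // Mercurius decorates the instance with `graphql`; fail fast if it is missing
  if (typeof fastify.graphql !== 'function') {
    throw new Error('No graphql plugin registered.');
  }
  // Rest of plugin initialisation
});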

Options and their usages

Get auth context

This will run in preExecution and use the provided function in the plugin options to construct a user RBAC and put it onto the context with the credentials.
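
A rough sketch of what this option could look like, written as plain functions so it runs without Mercurius installed. The `x-user-roles` header, the `getAuthContext` name, and the `context.auth` property are all assumptions for illustration; a real strategy would verify a token rather than trust a header. The handler follows the `(schema, document, context)` shape of the `preExecution` hook.

```javascript
// Hypothetical option: builds the auth context from the incoming request headers.
// A real implementation would verify a token here instead of trusting a header.
function getAuthContext(headers) {
  const raw = headers['x-user-roles'] || '';
  const roles = raw
    .split(',')
    .map((role) => role.trim())
    .filter(Boolean);
  // Requests without any role fall back to UNKNOWN.
  return { roles: roles.length > 0 ? roles : ['UNKNOWN'] };
}

// Handler with the preExecution hook shape: (schema, document, context).
async function authenticate(schema, document, context) {
  // Assumes the Fastify reply (and therefore the request) is available on the context.
  context.auth = getAuthContext(context.reply.request.headers);
}

// Registration, assuming `app` is a Fastify instance with Mercurius registered:
// app.graphql.addHook('preExecution', authenticate);
```

Keeping `getAuthContext` separate from the hook handler means the user-supplied strategy can be tested without a running server.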

Schema directive

The plugin will require a new “auth” Directive that allows one to define the policies required for a field. This option tells us what schema directive to look for in the destination schema for each service.

Note, we will need to define the schema directive but also associated types so they are valid when added to the service schema. Both gateway and services will need to define (at least) the directive.

Authz checks

This would happen in preGatewayExecution. For each service, compare the schema with the Query document and identify the fields (through the presence of Directive ASTs) against which to check the user policies (obtained from the context).

Depending on the Directive, we will be able to run the defined auth strategy. When it runs:

  • Upon success, do nothing to the Query Document.
  • Upon failure, remove the field from the Query Document AST and add an error object to the GraphQL response with the appropriate metadata.

Once completed, execute the Query (adjusted in the case of failures) and resolve the response as normal.
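
To make the failure path concrete, here is a small sketch that prunes unauthorized top-level fields from a Query Document and collects the matching errors. It works on plain objects in the graphql-js AST shape, so it has no dependencies. The `requiredRoleByField` map and the function name are hypothetical, and nested fields and fragments are deliberately not handled. As far as I understand the hooks API, a hook can return the adjusted `document` together with `errors`, but treat that as an assumption to check.

```javascript
// Removes top-level fields the user may not access and reports them as errors.
// `requiredRoleByField` is a hypothetical map built beforehand from the
// @auth directives found in the service schema.
function pruneUnauthorizedFields(document, requiredRoleByField, userRoles) {
  const errors = [];
  const definitions = document.definitions.map((definition) => {
    if (definition.kind !== 'OperationDefinition') return definition;
    const selections = definition.selectionSet.selections.filter((selection) => {
      if (selection.kind !== 'Field') return true;
      const fieldName = selection.name.value;
      const required = requiredRoleByField[fieldName];
      if (required === undefined || userRoles.includes(required)) return true;
      errors.push({
        message: `Not authorized to access field "${fieldName}"`,
        // Use the alias in the path when the field was aliased in the query.
        path: [selection.alias ? selection.alias.value : fieldName],
      });
      return false;
    });
    // Copy instead of mutating so the original document stays untouched.
    return { ...definition, selectionSet: { ...definition.selectionSet, selections } };
  });
  return { document: { ...document, definitions }, errors };
}
```

One case this sketch leaves open: if every field is removed, the selection set ends up empty, which is not a valid query, so the caller would need to short-circuit and return only the errors.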

Note, I think there is some complexity to be careful with from both an implementation and performance point of view in the schema comparison with the query document. In addition, for reference types, we will need to be careful here as well.

Alternative approaches and open questions

  • We could do everything in preExecution instead of preGatewayExecution. I don't know what the performance impact of each approach is but I like the idea of doing it on a per service basis (i.e. preGatewayExecution) - I think it could give us some extra flexibility
  • What would be the best way to get started? You mentioned creating a new repo, perhaps?

Let me know what you think! As mentioned, I'm very happy to put this together once everyone is agreed and if that works for everyone! :)

@mcollina
Collaborator

I think we should also support authorizing local resolvers, not just remote servers.

Apart from that, go for it. Would you like me to create a new repo in this org?

@jonnydgreen
Contributor

I think we should also support authorizing local resolvers, not just remote servers.

No problem, I'll make sure that's included!

Apart from that, go for it. Would you like me to create a new repo in this org?

Awesome, thanks! :) Yes please, if that's okay with you?

@jonnydgreen
Contributor

Hey @mcollina, sorry for the bother - I just wondered if you'd had a chance to create a new repo for this in the org? Cheers! :)

@jonnydgreen
Contributor

Thanks very much! :)

@valdestron

@jonnydgreen

Your solution would be very nice.

I am doing this a little differently. I use a custom authChecker middleware in the federated services, which parses the RBAC headers. A field-level @Authorized() directive then calls the authChecker, where the RBAC logic is defined.

But I am not sure how to pass RBAC headers from the gateway to downstream services. In Apollo Federation it's possible to achieve this with context linking; I'm not sure how to do it with Mercurius.

@jonnydgreen
Contributor

@valdestron thanks, I'm getting started on it asap :)

In Mercurius gateway and depending on how you are generating the RBAC headers, you can set the rewriteHeaders option which may help you in passing RBAC headers to federated services. I have also used it in combination with Mercurius hooks. For example, passing OpenTelemetry headers to federated services, and thus, enabling distributed tracing across GQL requests.

There is also this issue which may interest you. It proposes to add context to rewriteHeaders, which may also help you in the situation you described :)

@valdestron

valdestron commented Apr 8, 2021

Yes, I saw this issue with rewriteHeaders. I am having a problem generating RBAC headers using rewriteHeaders, as I need to access a remote Identity Provider like Auth0 and a cache like Redis before forwarding RBAC headers downstream.

If your solution would be in place, there would be three auth options IMO using mercurius gateway:

  1. access context in rewriteHeaders, and pass RBAC headers, then use some AuthGuard/AuthChecker downstream (IF CONTEXT ISSUE WILL BE FIXED)
  2. pass the authorization header downstream and reimplement RBAC there; decoupled services lose their purpose a little bit...
  3. central auth in gateway (IN PROGRESS) - would be the best approach for federated/stitched services; I cannot find any implementations of such central auth in the GraphQL server ecosystem

It seems that until the context issue is fixed or your solution is done, we have no other option but to reimplement RBAC in each service.

@alex-parra
Contributor

  1. access context in rewriteHeaders, and pass RBAC headers, then use some AuthGuard/AuthChecker downstream (IF CONTEXT ISSUE WILL BE FIXED)

As of release 7.4.0 (Apr 9th, 2021) context is now passed as a second argument to rewriteHeaders.

gateway: {
  services: [
    {
      name: '...',
      url: '...',
      // `context` is the GraphQL context, available as the second argument
      rewriteHeaders: (headers, context) => {
        return { 'x-custom-header': 'custom-header-value' }
      },
    },
  ],
}
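
Applied to the RBAC case discussed above, a rough sketch could look like the following. The `context.auth.roles` property and the `x-user-roles` header name are assumptions for illustration (for example, roles resolved earlier in a hook after talking to the identity provider), and the service name and URL are placeholders.

```javascript
// Hypothetical: `context.auth.roles` is assumed to be set earlier, for example in a hook.
function forwardRbacHeaders(headers, context) {
  const roles = (context.auth && context.auth.roles) || [];
  return {
    // Forward the original credentials, if any, so services can still verify them.
    ...(headers.authorization ? { authorization: headers.authorization } : {}),
    // Hypothetical header name carrying the roles resolved at the gateway.
    'x-user-roles': roles.join(','),
  };
}

// Gateway options fragment using it (service name and URL are placeholders).
const gateway = {
  services: [
    {
      name: 'user',
      url: 'http://localhost:4001/graphql',
      rewriteHeaders: forwardRbacHeaders,
    },
  ],
};
```

Because the roles are resolved once at the gateway, the downstream services only need a lightweight checker that reads the forwarded header.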

@dragonfriend0013

@valdestron thanks, I'm getting started on it asap :)

In Mercurius gateway and depending on how you are generating the RBAC headers, you can set the rewriteHeaders option which may help you in passing RBAC headers to federated services. I have also used it in combination with Mercurius hooks. For example, passing OpenTelemetry headers to federated services, and thus, enabling distributed tracing across GQL requests.

There is also this issue which may interest you. It proposes to add context to rewriteHeaders, which may also help you in the situation you described :)

@jonnydgreen Do you have a small example of using OpenTelemetry with gateway and federated services? I am working on this right now and am having issues.

@jonnydgreen
Contributor

jonnydgreen commented Jun 8, 2021

@dragonfriend0013 I don't have an example I can share unfortunately but I can point you in the right direction for sure and am happy to help get it working for you! :)

The Mercurius docs are a good starting point for basic OTEL tracing on a Mercurius server: https://mercurius.dev/#/docs/integrations/open-telemetry

Once you have this (or equivalent) up and running, if you wanted to get distributed tracing working in the gateway and federated services, you can use a combination of hooks, OTEL propagation API and rewrite headers. The idea being to:

  • Use the Propagation API to inject trace information into the headers in an appropriately chosen hook on the gateway
  • Make sure these populated headers are correctly passed to federated services using rewriteHeaders
  • Use the Propagation API to extract trace information on the federated services (I'm pretty sure this will be done automatically for you in the Mercurius docs example anyway)
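
To show the shape of what travels between the gateway and the services, here is a dependency-free sketch that builds and parses a W3C Trace Context `traceparent` header by hand. In a real setup the OTEL Propagation API would do the injecting and extracting for you; the `context.trace` property and the function names below are assumptions for illustration only.

```javascript
// Builds a W3C Trace Context `traceparent` value: version-traceId-spanId-flags.
function buildTraceparent(traceId, spanId, sampled) {
  if (!/^[0-9a-f]{32}$/.test(traceId)) throw new Error('traceId must be 32 lowercase hex characters');
  if (!/^[0-9a-f]{16}$/.test(spanId)) throw new Error('spanId must be 16 lowercase hex characters');
  return `00-${traceId}-${spanId}-${sampled ? '01' : '00'}`;
}

// Gateway side: a hook is assumed to have stored the active trace on the
// context (hypothetical `context.trace`); rewriteHeaders copies it outwards.
function rewriteTraceHeaders(headers, context) {
  if (!context.trace) return {};
  const { traceId, spanId, sampled } = context.trace;
  return { traceparent: buildTraceparent(traceId, spanId, sampled) };
}

// Service side: parse the header back into its parts, or null if it is malformed.
function parseTraceparent(value) {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(value || '');
  if (!match) return null;
  return {
    traceId: match[1],
    spanId: match[2],
    // The lowest bit of the flags byte marks the trace as sampled.
    sampled: (parseInt(match[3], 16) & 1) === 1,
  };
}
```

`rewriteTraceHeaders` has the same `(headers, context)` signature as the `rewriteHeaders` option, so it could be plugged into a gateway service definition directly.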

@dragonfriend0013

dragonfriend0013 commented Jun 9, 2021

@jonnydgreen I just copied the example from https://mercurius.dev/#/docs/integrations/open-telemetry and had to make a change in tracer.js (HttpTraceContext was undefined, so I changed it to HttpTraceContextPropagator). Now I am getting "getSpan is not a function" in the serviceAdd.js example when running a query. This is coming from @autotelic/fastify-opentelemetry/index.js:109:18. I was getting this error in my own project code as well (which is why I was stuck yesterday). I assume something has changed since the original example was posted, but I am having trouble finding out what needs to be updated.
EDIT: Looks like I need to wait for @autotelic/fastify-opentelemetry@0.13.0 as it will have support for opentelemetry version 0.20.0

@jonnydgreen
Contributor

Yeah, I was about to say it sounds like a version issue from the error you're describing - looks like 0.13.0 has been released as of 2 hours ago, let me know if that works for you: https://www.npmjs.com/package/@autotelic/fastify-opentelemetry?activeTab=readme

@smolinari
Contributor

If authentication and authorization via SDL becomes a thing with Mercurius, will it be 100% optional?

Scott

@jonnydgreen
Contributor

This is available as a plugin via mercurius-auth, so yes, 100% optional :)

@dragonfriend0013

dragonfriend0013 commented Aug 17, 2021

I am using mercurius-auth on my different services. As per the docs, I have added

directive @auth(requires: String!) on OBJECT | FIELD_DEFINITION

and have written my own code to do the authorization. This works per service. However, when I use the Mercurius gateway to stitch the services together, I get duplicates of the @auth directive, and this causes the parsing to fail.

{
  "error": {
    "data": null,
    "errors": [
      {
        "message": "Error: There can be only one directive named \"@auth\".\n\nThere can be only one directive named \"@auth\".\n\nThere can be only one directive named \"@auth\".\n\nThere can be only one directive named \"@auth\"."
      }
    ]
  }
}

Relevant part of the composed gateway schema:

directive @external on FIELD_DEFINITION

directive @requires(fields: _FieldSet!) on FIELD_DEFINITION

directive @provides(fields: _FieldSet!) on FIELD_DEFINITION

directive @key(fields: _FieldSet!) on OBJECT | INTERFACE

directive @extends on OBJECT | INTERFACE

# Authorization directive.
directive @auth(requires: String!) on OBJECT | FIELD_DEFINITION

# Authorization directive.
directive @auth(requires: String!) on OBJECT | FIELD_DEFINITION

# Authorization directive.
directive @auth(requires: String!) on OBJECT | FIELD_DEFINITION

does anyone have any suggestions, or is this a bug in gateway mode?

@jonnydgreen
Contributor

jonnydgreen commented Aug 17, 2021

does anyone have any suggestions, or is this a bug in gateway mode?

I've written a quick test and can confirm that it's an issue in gateway mode: https://github.com/jonnydgreen/mercurius/blob/bugfix/duplicate-directives/test/gateway/custom-directives.js

Test result:

[Image: Screenshot 2021-08-17 at 21 31 55]

Schema:

[Image: Screenshot 2021-08-17 at 21 26 05]

From the above, we can see that the service directives are duplicated in the gateway, which correlates with what you are seeing.

An initial fix would be to de-duplicate these directives when the schema is constructed; however I'd be concerned about how we might handle directives of the same name but different definition across services.

I think there are several options to account for this situation (differing directive definitions with the same name):

  • Allow users to turn directive de-duplication on/off in an option
  • Take the first directive
  • Error

As this isn't related to auth at the gateway level and rather, a generic issue for the gateway, I wonder if it's worth us creating a draft PR/issue and working through a solution there?

@mcollina
Collaborator

I would recommend taking the first directive.

I think we might want to support a few things:

  1. deduplicate if all the directives are the same (if possible)
  2. filter out directives that should not be present in the supergraph

It is also ok to error if services could not be composed together.
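
A minimal sketch of these options, working on SDL directive definitions as plain strings. This only illustrates the behaviours discussed here (drop identical duplicates, then either take the first definition or error on a conflict); it is not Mercurius's actual implementation, and the function names and the `onConflict` option are made up for the example.

```javascript
// Extracts the directive name from an SDL definition such as
// `directive @auth(requires: String!) on OBJECT | FIELD_DEFINITION`.
function directiveName(definition) {
  const match = /^directive\s+@(\w+)/.exec(definition.trim());
  if (!match) throw new Error(`Not a directive definition: ${definition}`);
  return match[1];
}

// Collapses whitespace so formatting differences do not count as conflicts.
function normalize(definition) {
  return definition.trim().replace(/\s+/g, ' ');
}

// Keeps one definition per directive name. Identical duplicates are dropped;
// conflicting ones either throw or fall back to the first definition seen.
function dedupeDirectives(definitions, { onConflict = 'first' } = {}) {
  const byName = new Map();
  for (const definition of definitions) {
    const name = directiveName(definition);
    const normalized = normalize(definition);
    if (!byName.has(name)) {
      byName.set(name, normalized);
    } else if (byName.get(name) !== normalized && onConflict === 'error') {
      throw new Error(`Conflicting definitions for directive "@${name}"`);
    }
  }
  return [...byName.values()];
}
```

With the three identical `@auth` definitions from the composed schema above, `dedupeDirectives` keeps a single copy; a service that declared `@auth(requires: Role!)` instead would be ignored with `onConflict: 'first'` and rejected with `onConflict: 'error'`.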

@mcollina
Collaborator

Could you open a PR?

@jonnydgreen
Contributor

Could you open a PR?

Yeah no problem! I'll start that this evening

@jonnydgreen
Contributor

@dragonfriend0013 a fix for this is now shipped in https://github.com/mercurius-js/mercurius/releases/tag/v8.1.3 :)
