This repository has been archived by the owner on Sep 3, 2021. It is now read-only.

Declarative primary keys, constraints and indexes with @id, @unique, and @index directives #499

Merged
merged 27 commits into from
Aug 25, 2020

@michaeldgraham (Collaborator) commented Aug 23, 2020

Graph Key Management

This PR resolves #484 by introducing optional @id, @unique, and @index field directives for asserting unique property constraints and property indexes on node type fields, and for supporting declarative identification of a primary key field. Further, once we support generating input object mutation arguments (#497), we could better distinguish which arguments are used as keys. Fields with these directives could then be used for generating mutation API that supports using multiple keys for rich node selection in a declarative and performant way.

Current Behavior

Field Type Precedence

The primary key field arguments generated in the mutation API are used as graph keys when matching nodes. The getPrimaryKey function attempts to select a field of an ideal scalar type and nullability. The precedence of this selection follows:

  • Use the first ID! field.
  • If none exist, use the first ID type field.
  • If none exist, use the first ! scalar field.
  • If none exist, use the first scalar field.
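
The ordering above can be sketched roughly as follows. This is only an illustration, not the actual getPrimaryKey code: the function name selectByTypePrecedence and the simplified field shape ({ name, type, required, isScalar }) are assumptions made for the example.

```javascript
// Hypothetical, simplified field shape: { name, type, required, isScalar }.
// Not the library's getPrimaryKey; it only illustrates the selection order.
function selectByTypePrecedence(fields) {
  const scalars = fields.filter((f) => f.isScalar);
  return (
    scalars.find((f) => f.type === 'ID' && f.required) || // 1. first ID! field
    scalars.find((f) => f.type === 'ID') || // 2. first ID field
    scalars.find((f) => f.required) || // 3. first non-null scalar field
    scalars[0] || // 4. first scalar field
    null // no scalar fields at all
  );
}

const movieFields = [
  { name: 'title', type: 'String', required: true, isScalar: true },
  { name: 'id', type: 'ID', required: false, isScalar: true },
  { name: 'movieId', type: 'ID', required: true, isScalar: true }
];

// movieId is selected: it is the first ID! field, even though it is declared last.
console.log(selectByTypePrecedence(movieFields).name);
```

Note how the result depends on both field type and field order, which is the source of the confusion that an explicit directive is meant to avoid.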

Having a default primary key selected in this way ensures that a mutation API is generated during development. But the reliance on field ordering can confuse those unfamiliar with this behavior. So it seems we should have an optional field directive for declaring a single field as a primary key.

New Behavior

@id Directive

Although Cypher allows for complex filtering when matching data, performance considerations often motivate the use of a single property as a key, with a unique property constraint set on it, which also sets a property index. If we were to add this behavior, but within the current decision logic, then we would be setting a unique property constraint in a non-declarative way.

A @unique directive could be used to declare a field as a primary key. But it should be possible to have multiple @unique fields to improve performance generally. So with multiple @unique fields, which field to use would again become ambiguous and fall back to using field type precedence to select a primary key. So a @unique directive has been implemented, along with a single-use @id directive for explicitly declaring a field as a primary key. The @id directive also sets a unique property constraint and index, but allows for disambiguation with multiple @unique directive fields when wanting to use a single key for generated mutation API node selection.

type Movie {
  id: ID
  movieId: ID @id # Primary key
}

@unique Directive

Both the @id and @unique directives add a unique property constraint to a scalar type field on a node type. These directives cannot be combined with each other or with @neo4j_ignore, @relation, or @cypher, and they cannot be used on @relation type fields or on temporal or spatial type fields.

type Movie {
  id: ID! @unique
  movieId: ID! @id # Primary key
  title: String!
}

Using @unique and not @id

When an @id field is not provided and there is a @unique field, it is selected as a primary key:

type Movie {
  id: ID!
  movieId: ID! @unique # Primary key
  title: String!
}

Using only multiple @unique

If multiple @unique fields exist but no @id field, then which is used as a key depends on field type precedence. Below, the movieId field would be used as a primary key:

type Movie {
  id: ID!
  movieId: ID! @unique # Primary key
  title: String! @unique
}

@index Directive

The apoc.schema.assert procedure, which is used here to set unique property constraints, also supports setting only a property index. So to support the possibility of using an indexed, but not unique, primary key, an @index directive has been added.

type Movie {
  id: ID! @unique
  movieId: ID! @id # Primary key
  title: String! @index
}

Using @index and neither @id nor @unique

When @id and @unique are not used but @index is, the @index field is selected as the primary key:

type Movie {
  id: ID!
  movieId: ID! @index # Primary key
  title: String!
}

Using only multiple @index

Using only multiple @index fields is similar to using only multiple @unique fields, falling back to field type precedence, but at least starting out with the benefit of preferring an indexed field:

type Movie {
  id: ID!
  title: String! @index
  movieId: ID! @index # Primary key
}

Using only @index and @unique and not @id

When both @index and @unique are used, @unique takes precedence:

type Movie {
  id: ID! @index
  movieId: ID! @unique # Primary key
  title: String! @index
}

Using assertSchema

A new assertSchema export has been added. It collects all @id, @unique, and @index fields, generates the Cypher statement for calling apoc.schema.assert() with those fields used as values for the procedure's indexLabels and constraintLabels arguments, and sends that statement to your Neo4j database using your driver. Below, we call assertSchema during server startup to sync our Neo4j indexes and constraints with those declared in our schema using the @id, @index, and @unique directives:

import neo4j from 'neo4j-driver';
import {
  makeAugmentedSchema,
  assertSchema
} from 'neo4j-graphql-js';

const driver = neo4j.driver(...);

const schema = makeAugmentedSchema(...);

assertSchema({ schema, driver, debug: true });

// When debug = true, apoc.schema.assert() result is printed as a table:
┌─────────┬────────────────────┬────────────────┬────────────────────┬────────┬───────────┐
│ (index) │       label        │      key       │        keys        │ unique │  action   │
├─────────┼────────────────────┼────────────────┼────────────────────┼────────┼───────────┤
│    0    │ 'UniqueStringNode' │ 'uniqueString' │ [ 'uniqueString' ] │  true  │ 'CREATED' │
│    1    │    'UniqueNode'    │  'anotherId'   │  [ 'anotherId' ]   │ false  │ 'CREATED' │
│    2    │    'UniqueNode'    │    'string'    │    [ 'string' ]    │  true  │ 'CREATED' │
│    3    │    'UniqueNode'    │      'id'      │      [ 'id' ]      │  true  │ 'CREATED' │
│    4    │      'State'       │     'name'     │     [ 'name' ]     │ false  │ 'CREATED' │
│    5    │      'Person'      │    'userId'    │    [ 'userId' ]    │  true  │ 'CREATED' │
│    6    │    'OldCamera'     │      'id'      │      [ 'id' ]      │  true  │ 'CREATED' │
│    7    │    'NewCamera'     │      'id'      │      [ 'id' ]      │  true  │ 'CREATED' │
│    8    │      'Movie'       │   'movieId'    │   [ 'movieId' ]    │ false  │ 'CREATED' │
│    9    │      'Camera'      │      'id'      │      [ 'id' ]      │  true  │ 'CREATED' │
└─────────┴────────────────────┴────────────────┴────────────────────┴────────┴───────────┘
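
To give a rough idea of the statement being generated, here is a minimal sketch of building the two maps passed to apoc.schema.assert. The helper name buildAssertStatement and the keyFields input shape are hypothetical and are not part of this PR's code; the sketch assumes the directive fields have already been collected per node label.

```javascript
// Hypothetical input: key fields already collected from the schema, one entry per field.
const keyFields = [
  { label: 'Movie', field: 'movieId', directive: 'id' },
  { label: 'Movie', field: 'title', directive: 'index' },
  { label: 'Person', field: 'userId', directive: 'unique' }
];

// Hypothetical helper, not the schemaAssert function from this PR.
function buildAssertStatement(fields) {
  const indexes = {};
  const constraints = {};
  for (const { label, field, directive } of fields) {
    // @index asks only for a property index;
    // @id and @unique ask for a unique property constraint.
    const target = directive === 'index' ? indexes : constraints;
    (target[label] = target[label] || []).push(field);
  }
  return {
    // First argument: label -> indexed properties.
    // Second argument: label -> uniquely constrained properties.
    statement: 'CALL apoc.schema.assert($indexes, $constraints)',
    params: { indexes, constraints }
  };
}

const { statement, params } = buildAssertStatement(keyFields);
console.log(statement);
console.log(JSON.stringify(params));
```

The resulting statement and params could then be run in a write transaction with the driver, for example via session.run(statement, params).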

Directive Precedence

In summary, a primary key is now selected with the following precedence:

@id
  • If one @id field exists, use it
@unique
  • If no @id field exists, get all @unique fields
    • If one @unique field exists, use it
    • If multiple @unique fields exist, select one by type precedence
@index
  • If no @id field and no @unique fields exist, get all @index fields
    • If one @index field exists, use it
    • If multiple @index fields exist, select one by type precedence
Field type precedence
  • If no @id field exists and no @unique fields exist and no @index fields exist, get all scalar fields
    • Select one by type precedence (current default behavior)
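
As a rough sketch of this decision order (not the actual code in src/augment/types/node/selection.js), the logic could look like the following. The function names and the simplified field shape ({ name, type, required, directives }) are assumptions for illustration, and all fields are assumed to be scalar.

```javascript
// Hypothetical field shape: { name, type, required, directives: ['id' | 'unique' | 'index'] }.
// Fallback ordering: ID!, then ID, then any non-null field, then the first field.
function byTypePrecedence(fields) {
  return (
    fields.find((f) => f.type === 'ID' && f.required) ||
    fields.find((f) => f.type === 'ID') ||
    fields.find((f) => f.required) ||
    fields[0] ||
    null
  );
}

function selectPrimaryKey(fields) {
  // Try each directive in order of precedence: @id, then @unique, then @index.
  for (const directive of ['id', 'unique', 'index']) {
    const matches = fields.filter((f) => f.directives.includes(directive));
    // A single match is returned as-is; multiple matches fall back to type precedence.
    if (matches.length > 0) return byTypePrecedence(matches);
  }
  // No key directives at all: current default behavior.
  return byTypePrecedence(fields);
}

const movie = [
  { name: 'id', type: 'ID', required: true, directives: ['index'] },
  { name: 'movieId', type: 'ID', required: true, directives: ['unique'] },
  { name: 'title', type: 'String', required: true, directives: ['index'] }
];

// movieId is selected, because @unique takes precedence over @index.
console.log(selectPrimaryKey(movie).name);
```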

In this way, we can be more explicit about keys moving forward while maintaining reasonable defaults that prioritize performance. The updated code for getting a node type primary key has been moved to a new file: src/augment/types/node/selection.js.

Tests

The following tests have been added to test/unit/cypherTest.test.js:

  • Create object type node with @id field
  • Create interfaced object type node with @unique field
  • Create object type node with @index field
  • Create object type node with multiple @unique ID type fields
  • Merge object type node with @unique field
  • Delete object type node with @unique field
  • Add relationship using @id and @unique node type for node selection
  • Merge relationship using @id and @unique node type fields for node selection
  • Remove relationship using @id and @unique node type fields for node selection

The following tests have been added to test/unit/assertSchema.test.js:

  • Call assertSchema for @id, @unique, and @index fields on node types
  • Throws error if node type field uses @id more than once
  • Throws error if node type field uses @id with @unique
  • Throws error if node type field uses @id with @index
  • Throws error if node type field uses @unique with @index
  • Throws error if node type field uses @id with @cypher
  • Throws error if node type field uses @unique with @cypher
  • Throws error if node type field uses @index with @cypher
  • Throws error if @id is used on @relation field
  • Throws error if @unique is used on @relation field
  • Throws error if @index is used on @relation field
  • Throws error if @id is used on @relation type field
  • Throws error if @unique is used on @relation type field
  • Throws error if @index is used on @relation type field
  • Throws error if @id is used on @relation type
  • Throws error if @unique is used on @relation type
  • Throws error if @index is used on @relation type
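
A few of the rejected combinations listed above could be checked along these lines. This is only a sketch under assumptions: the function name validateKeyDirectives, the field shape, and the error messages are hypothetical and do not reflect the validation function added in this PR.

```javascript
// Hypothetical field shape: { name, directives: ['id', 'unique', 'index', 'cypher', 'relation', ...] }.
const KEY_DIRECTIVES = ['id', 'unique', 'index'];

function validateKeyDirectives(typeName, fields) {
  // @id is single-use: at most one field per node type may carry it.
  const idFields = fields.filter((f) => f.directives.includes('id'));
  if (idFields.length > 1) {
    throw new Error(`${typeName}: @id can be used on only one field`);
  }
  for (const field of fields) {
    const keys = field.directives.filter((d) => KEY_DIRECTIVES.includes(d));
    // @id, @unique, and @index are mutually exclusive on a single field.
    if (keys.length > 1) {
      throw new Error(`${typeName}.${field.name}: @${keys.join(' and @')} cannot be combined`);
    }
    // Key directives make no sense on computed or relationship fields.
    const conflict = field.directives.find((d) => d === 'cypher' || d === 'relation');
    if (keys.length === 1 && conflict) {
      throw new Error(`${typeName}.${field.name}: @${keys[0]} cannot be used with @${conflict}`);
    }
  }
  return true;
}

// A valid type passes; an invalid one throws.
console.log(validateKeyDirectives('Movie', [
  { name: 'movieId', directives: ['id'] },
  { name: 'title', directives: ['index'] }
]));
```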

Future design considerations

Input object arguments

Once we add support for generating input object mutation arguments (#497), and with being able to declare multiple fields as unique and indexed, we could support generating input objects with multiple keys. Below, we have an input object argument of type _MovieWhere that has a single input field used as a primary key to select a node for an update mutation:

type Movie {
  id: ID!
  movieId: ID! @id
  title: String!
}
input _MovieWhere {
  movieId: ID
}
input _MovieData {
  title: String
}
type Mutation {
  UpdateMovie(where: _MovieWhere, data: _MovieData): Movie
}

With the @unique directive, we could use multiple unique, indexed fields for keys:

type Movie {
  id: ID! @unique
  movieId: ID! @unique # or still @id
  title: String!
}
input _MovieWhere {
  id: ID
  movieId: ID
}
input _MovieData {
  title: String
}
type Mutation {
  UpdateMovie(where: _MovieWhere, data: _MovieData): Movie
}

We could also use the @index directive if we want to allow for non-unique, indexed keys:

type Movie {
  id: ID! @unique
  movieId: ID! @unique
  title: String! @index
}
input _MovieWhere {
  id: ID
  movieId: ID
  title: String
}
input _MovieData {
  title: String
}
type Mutation {
  UpdateMovie(where: _MovieWhere, data: _MovieData): Movie
}

Generated Documentation Comments

Once we implement #483 to add documentation comments (GraphQL AST descriptions), we could generate descriptions for mutation API field arguments and input object value definitions that specify whether an argument is for an @id, @unique, or @index field.

Federation @key Directive

It seems that assertSchema could be updated to also set unique property constraints for the scalar fields provided to @key directives when using Apollo Federation. This might save some development time considering that the fields used as keys in a federated schema are already expressed for a given entity using the @key type directive.

Generalizing initialization: syncSchema

The assertSchema function is a first step toward additional exports that help run initialization procedures in Neo4j for setting up constraints, indexes, etc. Once we add a @search directive to support #266, and perhaps a @rename directive for renaming node labels and properties, we could make a more general syncSchema export with a configuration such as:

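// Option 1: enable each step with a boolean flag
const config = {
  assert: true,
  search: true,
  rename: true
};

// Option 2 (an alternative to the above): configure each step individually
const config = {
  assert: {
    id: true,
    unique: true,
    index: true
  },
  search: { ... },
  rename: { ... }
};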

export const syncSchema = (schema, driver, config) => {
  // call assertSchema if config.assert  
  // call searchSchema if config.search
  // call renameSchema if config.rename
};

  • Adds a validation function, used during augmentation and within assertSchema, that throws custom errors for inappropriate or redundant directive combinations
  • Uses getTypeFields to update the field set used for getting the primary key for the node selection input type generated for relationship mutations
  • Uses a new schemaAssert function to generate the Cypher statement for calling apoc.schema.assert
  • Factors out an old, inappropriate helper function, _getNamedType, replacing it with unwrapNamedType
  • Removes relationship mutation tests that use temporal field node selection
@michaeldgraham michaeldgraham changed the title Declarative primary keys, constraints and indexes with @id, @unique, and @index directives Declarative primary keys, constraints and indexes with @id, @unique, and @index directives Aug 23, 2020
@johnymontana johnymontana merged commit bdcb515 into neo4j-graphql:master Aug 25, 2020

@manonthemat

wonderful!

@malikalimoekhamedov

This sounds absolutely fabulous!
I just tried to use assertSchema as described but it turns out neo4j-graphql-js doesn't export it. Am I missing a point?

@michaeldgraham
Collaborator Author

Hey there @antikvarBE, hopefully we can get things working 👍 Have you imported this assertSchema?

@malikalimoekhamedov

@michaeldgraham , I'm using TypeScript so, according to the description of this thread, I should be importing it directly from 'neo4j-graphql-js' (instead of src). If I inspect current type definitions, I don't see assertSchema:

declare module 'neo4j-graphql-js' {
  import * as neo4jGraphQLJS from 'neo4j-graphql-js';
  import { GraphQLSchema } from 'graphql';

  const makeAugmentedSchema: (_ref3: any) => GraphQLSchema;
  const neo4jgraphql: (
    object: any,
    params: any,
    context: any,
    resolveInfo: any,
    debugFlag?: any,
    ...args: any[]
  ) => any;

  export { makeAugmentedSchema, neo4jgraphql };
}

@michaeldgraham
Collaborator Author

Just to make sure - what version of neo4j-graphql-js are you using?

@malikalimoekhamedov

Latest 16.1

@michaeldgraham
Collaborator Author

I was just able to get it working locally with a fresh install, but not with TypeScript, as I don't have a setup for that right now. So I'm thinking it might be a TypeScript issue somehow? There's nothing special about the function format of the assertSchema export, so I'm not sure why it wouldn't be showing up for you :(

@malikalimoekhamedov

malikalimoekhamedov commented Aug 28, 2020

@michaeldgraham , sorry. Totally and completely my fault. I forgot I had my own type definitions file I created manually as neo4j-graphql-js doesn't supply any. So, every time there's something new, I have to update it myself (which I now did). Typechecking now works fine. However, the following chunk of code doesn't print the beautiful assertion table:

const knowledgeGraphSchema = makeAugmentedSchema({
  typeDefs: knowledgeGraphTypeDefs
});

assertSchema({ schema: knowledgeGraphSchema, driver, debug: true });

For the sake of a test, I decorated some things in my schema with @id, @unique and @index. I'm testing it by running the Apollo GraphQL server in an offline AWS lambda environment.


UPDATE

I'm running into the following error when trying to call assertSchema as mentioned above:

Neo4jError: Failed to invoke procedure `apoc.schema.assert`: Caused by: IndexEntryConflictException{propertyValues=( String("") ), addedNodeId=12660, existingNodeId=1091}
: 
    at captureStacktrace (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/result.js:275:15)
    at new Result (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/result.js:66:19)
    at newCompletedResult (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/transaction.js:446:10)
    at Object.run (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/transaction.js:285:14)
    at Transaction.run (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/transaction.js:121:32)
    at /Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-graphql-js/dist/index.js:451:17
    at TransactionExecutor._safeExecuteTransactionWork (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/internal/transaction-executor.js:132:22)
    at TransactionExecutor._executeTransactionInsidePromise (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/internal/transaction-executor.js:120:32)
    at /Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/internal/transaction-executor.js:59:15
    at new Promise (<anonymous>)
    at TransactionExecutor.execute (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/internal/transaction-executor.js:58:14)
    at Session._runTransaction (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/session.js:300:40)
    at Session.writeTransaction (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-driver/lib/session.js:293:19)
    at executeQuery (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-graphql-js/dist/index.js:450:20)
    at Object.assertSchema (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/neo4j-graphql-js/dist/index.js:469:10)
    at eval (webpack-internal:///./src/graphql.ts:23:20)
    at Object../src/graphql.ts (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/build/service/src/graphql.js:253:1)
    at __webpack_require__ (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/build/service/src/graphql.js:20:30)
    at /Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/build/service/src/graphql.js:84:18
    at Object.<anonymous> (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/build/service/src/graphql.js:87:10)
    at Module._compile (internal/modules/cjs/loader.js:1075:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1096:10)
    at Module.load (internal/modules/cjs/loader.js:940:32)
    at Function.Module._load (internal/modules/cjs/loader.js:781:14)
    at Module.require (internal/modules/cjs/loader.js:964:19)
    at require (internal/modules/cjs/helpers.js:88:18)
    at /Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/serverless-offline/dist/lambda/handler-runner/in-process-runner/InProcessRunner.js:80:133
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at InProcessRunner.run (/Users/antikvar/Projects/Zvook/Development/apollo-graphql-server-lambda/node_modules/serverless-offline/dist/lambda/handler-runner/in-process-runner/InProcessRunner.js:80:9) {
  code: 'Neo.ClientError.Procedure.ProcedureCallFailed'
}

Labels
Needs docs Document the features added in this PR better
Development

Successfully merging this pull request may close these issues.

Optional unique directive for primary key selection
4 participants