Neo4j support #1575

Open
marktani opened this issue Jan 16, 2018 · 38 comments
Labels: domain/client, domain/schema, kind/feature, topic: connector

Comments

@marktani

This feature request serves as a central place to discuss development and progress for the Neo4j connector.

@samueledirito

I would like to contribute. Do you have any guidelines to follow? TIA

@sorenbs
Member

sorenbs commented Jan 17, 2018

@samueledirito It would be great to collaborate on this.

  • The first step is to map out the kind of queries a Neo4j Connector should support.
  • The next step would be to plan in detail what the resulting API should be. Our current API is designed to expose the power of relational databases, and I'm not sure trying to fit a graph database to the same API is a good idea.

Feel free to dump your thoughts (well formed or not) in this thread so we can get the conversation going.

@d3r1v3d

d3r1v3d commented Jan 17, 2018

Would really like to use this in an enterprise application my team is building. Willing to help in any way to push this forward.

@rohanray

https://github.com/neo4j-graphql/neo4j-graphql-js is a nice place to brainstorm initial design

@sorenbs
Member

sorenbs commented Jan 28, 2018

Thanks @d3r1v3d!

At this stage the best way to help is to contribute the following two things (either here or as a slack dm if it is confidential)

  1. Very concrete use case description.
  2. Initial suggestion for syntax for SDL and the generated GraphQL API.

When we prioritise which connectors to build, community validation plays a big role, so providing this will ensure that the Neo4j connector gets built sooner.

@peterclemenko

If I may ask, since I'm interested as well, what are you looking for with regards to syntax for SDL? I'd love to use Prisma but need neo4j out of the box (and would prefer to use prisma + stitching + gramps) but need to know more about what is needed to get this off the ground. I'll probably have to backport to Prisma when it's time, but that will hopefully not be too hard given the architecture I've designed for my application.

@sorenbs
Member

sorenbs commented Jan 28, 2018

@AoiGhost

Neo4j is untyped and GraphQL is typed. We need to decide how the developer should specify the mapping. Prisma's MySQL connector generates a powerful CRUD API based on a simple SDL type definition. neo4j-graphql takes a similar approach, in addition to allowing custom resolvers implemented by Cypher queries.

I'm interested to understand if this is the best approach or if there is some other model that would be better. If this is the best approach, then we need to map out the SDL syntax required to put type definitions on all aspects of the Graph.

Neo4j has nodes and edges. I'm not quite sure, but I think edges can have a direction as well as properties, so the SDL must be expressive enough to handle this.

It might very well be that Prisma does not need to support all the features of Neo4j to be useful. That's why I would like to see some very concrete use case descriptions, including the schema represented in a mock SDL as well as the queries required to implement the use case.

@jexp

jexp commented Jan 29, 2018

In neo4j-graphql we use the GraphQL schema as the definitive type mapping

(optionally it can also be generated from the existing data in the graph)

The graphql schema + your directives (like @relation(with direction), @isUnique) are already enough to represent Neo4j's "types".
Multiple labels can be represented by interfaces or by union types.

@peterclemenko

peterclemenko commented Feb 1, 2018

Edges with direction are supported in neo4j. Use case for what I'm doing is the following:
Application pushes and pulls data from nodes, connected by edges. Looks for connected nodes on edges to see if things are related as well as to a certain depth. Also supported is pulling all of N type nodes, or E type edges. Cypher, mutations, and subscriptions would all be useful.

That's pretty much all I need. Read/Write including seeing what nodes and edges are related to n depth along with subscriptions and mutations.
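
To make the "related to n depth" requirement concrete, here is a minimal in-memory sketch of a depth-limited breadth-first traversal. The `Edge` type and the `relatedWithinDepth` function are hypothetical names for illustration only; they are not part of Prisma or Neo4j. In Cypher the same question is usually expressed with a variable-length pattern such as `(a)-[*1..2]-(b)`.

```typescript
// Hypothetical in-memory graph types; none of these names come from Prisma or Neo4j.
type NodeId = string;

interface Edge {
  type: string; // relationship type, e.g. "KNOWS"
  from: NodeId; // start node of the directed edge
  to: NodeId;   // end node of the directed edge
}

// Return every node reachable from `start` within `maxDepth` hops,
// following edges in either direction, mapped to the hop count at
// which the node was first reached.
function relatedWithinDepth(
  edges: Edge[],
  start: NodeId,
  maxDepth: number,
): Map<NodeId, number> {
  // Build an undirected adjacency list once, up front.
  const adjacency = new Map<NodeId, NodeId[]>();
  const link = (a: NodeId, b: NodeId): void => {
    const neighbours = adjacency.get(a) ?? [];
    neighbours.push(b);
    adjacency.set(a, neighbours);
  };
  for (const edge of edges) {
    link(edge.from, edge.to);
    link(edge.to, edge.from);
  }

  // Breadth-first search, one level of the frontier per hop.
  const depthOf = new Map<NodeId, number>([[start, 0]]);
  let frontier: NodeId[] = [start];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: NodeId[] = [];
    for (const id of frontier) {
      for (const neighbour of adjacency.get(id) ?? []) {
        if (!depthOf.has(neighbour)) {
          depthOf.set(neighbour, depth);
          next.push(neighbour);
        }
      }
    }
    frontier = next;
  }
  depthOf.delete(start); // the start node is not "related" to itself
  return depthOf;
}

const sampleEdges: Edge[] = [
  { type: "KNOWS", from: "a", to: "b" },
  { type: "KNOWS", from: "b", to: "c" },
  { type: "KNOWS", from: "c", to: "d" },
];
const nearby = relatedWithinDepth(sampleEdges, "a", 2);
```

With the sample edges, `nearby` contains `b` at depth 1 and `c` at depth 2, while `d` is three hops away and is left out. A connector would push this work down to the database rather than traverse in application code.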

Pretty sure @jexp is right on the ball with this.

@GBR-422777

I am willing to help on this too. Any clue when the development might start? What is still needed?

@appinteractive

I'm really waiting for Neo4j support! GraphQL + Graph DBs is like a dream come true! How far along is the process on this?

@ivan-kleshnin

ivan-kleshnin commented Oct 23, 2018

I'm really waiting for Neo4j support! GraphQL + Graph DBs is like a dream come true!

If all databases are reduced to a lowest common denominator and you are unable to use database specific features (including Cypher language) – what's the point? Really, I'm asking with no irony, – what's the difference between Neo4j and AnySQL in this case?

Edit: I missed the point that Prisma.io doesn't own/host your database. So the question is canceled.

@victorleme

How is this going?
I want to help.

@stale

stale bot commented Jan 8, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.

@appinteractive

Ping

@chancesmith

Ping again

@TBlackford

Any updates on this?

@jonasdumas

Any update? Any idea around when the connector could be ready?
Q2 2019, 2020... 2030 ? :)

@mircicd

mircicd commented Dec 9, 2019

+1

@bionicles

Would use

@deomaius

I've just stumbled on you guys and this aspect alone has sold me. Looking forward to seeing it as a reality.

@Ungolim

Ungolim commented Jan 5, 2020

🥇

@2color transferred this issue from prisma/prisma1 on Feb 11, 2020
@2color changed the title from "Neo4j Connector" to "Neo4j support for Prisma 2" on Feb 11, 2020
@2color added the kind/feature label on Feb 11, 2020
@janpio changed the title from "Neo4j support for Prisma 2" to "Neo4j support" on Mar 8, 2020
@fireb1001

Would be great

@nargetdev

Also interested to contribute on this.

@ilmimris

When is support expected to be released? a week? a month?
Thanks 👍

@janpio added the team/product and tech/engines labels on Apr 30, 2020
@xero0001

I'm building a server that requires some machine learning features and depends entirely on Neo4j.
I have also been using Prisma since its alpha.
Nothing would be better than Prisma supporting Neo4j!

@nargetdev

nargetdev commented Sep 26, 2020

Yea so I've been getting a bit more into this now

The first thing to understand is that the functionality we're all looking for already exists in an abstract sense via GRANDstack

... let me explain.
So I started out using Prisma backed by postgres. Defined my schema, built a social network application and got all hydrated with user data. THEN I discover Neo4j.. a friggen beautiful piece of art in the form of a Database

As it turns out in nearly entirely turn-key "just works" fashion you can both import your data from Postgres --> Neo4j , and then Infer a GraphQL Schema From the Existing Neo4j Database we just made

..and have a nearly functionally equivalent neo4j backed GraphQL endpoint in very little time (for me it took a couple days for a fairly involved application) this is obviously a punchline.

... as such I would recommend anybody here serious about creating the Neo4j connector with Prisma should try out GRANDStack and get some experience looking through that lens in order to see GRANDstack's "big picture" juxtaposed with Nexus's

Now I've come to understand that it's not a mutually exclusive GRANDStack - or - Nexus Framework decision. Fundamentally each project has a distinct area of focus.

That said they certainly overlap. For instance both handle schema definition/construction. However Nexus achieves this via code. And the neo4j-graphql-js graphql middleware is bootstrapped via GraphQL SDL. Both result in a graphql endpoint that I can generate TypeScript types from for nearly compatible typesafe usage on my frontend.

Anyways this is just an intro. I suppose it's also worth noting that I was running a Kubernetes cluster for my Nexus Framework endpoint and Prisma Postgres instance, and now I'm literally running a single google cloud run serverless service which hosts the neo4j-graphql-js middleware. The Neo4j instance I outsourced to their own Neo4j managed "Aura" (provisioned through GCP)

I sense intuitively that this integration should/will happen. Graph-native databases are a fundamental staple of the modern software bag-o-tricks, and meaningful Nexus-Prisma-GRANDstack integration is a must. As I get deeper into GRANDstack I'd like to seriously ask what this Neo4j connector could look like: especially the "WHY?", then the "HOW?", and ultimately the concrete "WHAT?"

IF NOTHING ELSE CLICK TO SEE THE DATA VIS I GOT OUT OF THE BOX FOR VERY LITTLE TIME INVESTED

@matthewmueller added the domain/client and domain/schema labels on Jan 14, 2021
@janpio removed the tech/engines label on Apr 1, 2021
@hiddendragonXVII

In GRANDstack the GraphQL schema is inferred from the database contents. Could we also generate the Prisma schema from that?

@danstarns
Contributor

In my opinion, the key differences between Prisma’s existing supported connectors and graph databases such as Neo4j, ArangoDB, and Amazon Neptune are the way you think about and model your data, plus support for more sophisticated features such as relationship properties, pattern matching, and recommendations.

Neo4j has already done a great job of showcasing how to generate a GraphQL API and the Cypher (Neo4j’s query language) behind it, and even exposes an ORM/OGM (Object Graph Mapper) for the same ecosystem. We should look at their efforts for inspiration here: https://github.com/neo4j/graphql.

Looking further into modeling, take the schema from Prisma’s data model docs (https://www.prisma.io/docs/concepts/components/prisma-schema/data-model):

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

generator client {
  provider = "prisma-client-js"
}

model User {
  id      Int      @id @default(autoincrement())
  email   String   @unique
  name    String?
  role    Role     @default(USER)
  posts   Post[]
  profile Profile?
}

model Profile {
  id     Int    @id @default(autoincrement())
  bio    String
  user   User   @relation(fields: [userId], references: [id])
  userId Int
}

model Post {
  id         Int        @id @default(autoincrement())
  createdAt  DateTime   @default(now())
  title      String
  published  Boolean    @default(false)
  author     User       @relation(fields: [authorId], references: [id])
  authorId   Int
  categories Category[] @relation(references: [id])
}

model Category {
  id    Int    @id @default(autoincrement())
  name  String
  posts Post[] @relation(references: [id])
}

enum Role {
  USER
  ADMIN
}

image

This schema represents a relational model using a SQL-based provider, and for us to change this to represent Neo4j’s labeled property graph model there are a few key changes that would need to happen.

Firstly, Relationships are First Class

There are no foreign keys or joins in Neo4j, or in graphs in general. Instead, graphs embrace the concept of a relationship, and in Neo4j’s case relationships are ‘first-class entities’. Let's take a subset of the data model above and dissect it to see how it would compare if we were using Neo4j.

model User {
  id      Int      @id @default(autoincrement())
  email   String   @unique
  name    String?
  role    Role     @default(USER)
  posts   Post[]
}

model Post {
  id         Int        @id @default(autoincrement())
  createdAt  DateTime   @default(now())
  title      String
  published  Boolean    @default(false)
  author     User       @relation(fields: [authorId], references: [id])
  authorId   Int
}

In this subset we define a @relation(fields: [authorId], references: [id]) on the Post model. This relationship is ‘linked’ together by the corresponding authorId field on a Post, and at runtime, when you want to query a user along with its posts, a join populates the nested data.

As mentioned previously, joins do not exist in Neo4j; when using graphs you instead rely on relationships and the mechanisms for querying them. Here is an abstract model representing the same schema above using a labeled property graph:

image

Notice, firstly, that a Post no longer contains an authorId field; instead, we introduce a ‘label’ (in Neo4j terms, a relationship type) on the relationship itself: HAS_POST.

Below I introduce you to my idea of what the Prisma schema would look like for supporting Neo4j.

Firstly, the keyword model is used to represent a Node, just as it represents a Document when using Mongo or a Table otherwise.

Then, I introduce the @edge attribute, which works similarly to how we already define relationships in Prisma. I recommend a new attribute over reusing the existing @relation because semantically it makes it very clear that you are working with a graph, and it provides a platform to support more sophisticated features such as relationship properties.

model User {
  id      Int      @id @default(autoincrement())
  email   String   @unique
  name    String?
  role    Role     @default(USER)
  posts   Post[]
}

model Post {
  id         Int        @id @default(autoincrement())
  createdAt  DateTime   @default(now())
  title      String
  published  Boolean    @default(false)
  author     User       @edge(type: "HAS_POST", direction: IN)
}

The key differences are that the authorId field no longer exists on a Post, and that the @edge attribute is used, specifying a type and a direction.
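
As a rough illustration of how a type and a direction map onto a graph pattern, the sketch below renders the arguments of the proposed @edge attribute as a Cypher path pattern. `EdgeAttribute` and `edgePattern` are hypothetical names invented for this example, and the interpretation of `direction` (relative to the model that declares the field) is an assumption based on the schema above, not something an actual connector defines.

```typescript
// Hypothetical shape of the proposed @edge attribute's arguments.
type Direction = "IN" | "OUT";

interface EdgeAttribute {
  type: string;         // relationship type, e.g. "HAS_POST"
  direction: Direction; // assumed to be relative to the declaring model
}

// Render the Cypher pattern that connects the declaring node (`self`)
// to the related node (`other`). Identifiers are interpolated as-is;
// real code would have to validate or escape them.
function edgePattern(edge: EdgeAttribute, self: string, other: string): string {
  return edge.direction === "OUT"
    ? `(${self})-[:${edge.type}]->(${other})`
    : `(${self})<-[:${edge.type}]-(${other})`;
}

// Post.author is declared with direction IN, so the relationship
// points from the User into the Post.
const authorEdge: EdgeAttribute = { type: "HAS_POST", direction: "IN" };
const pattern = edgePattern(authorEdge, "post:Post", "author:User");
```

Here `pattern` is `(post:Post)<-[:HAS_POST]-(author:User)`, which is the same relationship as `(author:User)-[:HAS_POST]->(post:Post)` read from the Post's side.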

So, all in all, if we wanted to model this entire schema as an abstract graph model, we could represent it like so:

image

Then this would translate into the following Prisma schema:

datasource db {
  provider = "neo4j"
  url      = env("DATABASE_URL")
}

generator client {
  provider = "prisma-client-js"
}

model User {
  id      Int      @id @default(autoincrement())
  email   String   @unique
  name    String?
  role    Role     @default(USER)
  posts   Post[]
  profile Profile?
}

model Profile {
  id     Int    @id @default(autoincrement())
  bio    String
  user   User   @edge(type: "HAS_PROFILE", direction: IN)
}

model Post {
  id         Int        @id @default(autoincrement())
  createdAt  DateTime   @default(now())
  title      String
  published  Boolean    @default(false)
  author     User       @edge(type: "HAS_POST", direction: IN)
  categories Category[] @edge(type: "IN_CATEGORY", direction: OUT)
}

model Category {
  id    Int    @id @default(autoincrement())
  name  String
  posts Post[] @edge(type: "IN_CATEGORY", direction: IN)
}

enum Role {
  USER
  ADMIN
}

Diff:

datasource db {
-  provider = "postgresql"
+  provider = "neo4j"
  url      = env("DATABASE_URL")
}

generator client {
  provider = "prisma-client-js"
}

model User {
  id      Int      @id @default(autoincrement())
  email   String   @unique
  name    String?
  role    Role     @default(USER)
  posts   Post[]
  profile Profile?
}

model Profile {
  id     Int    @id @default(autoincrement())
  bio    String
-  user   User   @relation(fields: [userId], references: [id])
+  user   User   @edge(type: "HAS_PROFILE", direction: IN)
-  userId Int
}

model Post {
  id         Int        @id @default(autoincrement())
  createdAt  DateTime   @default(now())
  title      String
  published  Boolean    @default(false)
-  author     User       @relation(fields: [authorId], references: [id])
+  author     User       @edge(type: "HAS_POST", direction: IN)
-  authorId   Int
-  categories Category[] @relation(references: [id])
+  categories Category[] @edge(type: "IN_CATEGORY", direction: OUT)
}

model Category {
  id    Int    @id @default(autoincrement())
  name  String
-  posts Post[] @relation(references: [id])
+  posts Post[] @edge(type: "IN_CATEGORY", direction: IN)
}

enum Role {
  USER
  ADMIN
}

Secondly, Relationship Properties

Relationships can have properties/fields, and this is a completely different concept from what anyone using Prisma is used to. To explain this concept I'm going to introduce you to a new data model because the current model itself doesn't need relationship properties.

Here we have a simple ‘Movies’ example. You will find this data model all over Neo4j’s docs, so I find it relevant to use here. Below I represent it first using the proposed Prisma schema and then in an abstract form:

model Movie {
  title       String
  imdbRating  Float
  actors      Person[] @edge(type: "ACTED_IN", direction: IN)
}

model Person {
  name  String
  age   Int
}

image

We have already established that ‘relationships are first class’ in Neo4j and so that means they have mostly the same features as nodes. Given this model, what if we wanted to record what roles a given actor played in a movie? If we were using SQL for example we may make an intermediary ‘metadata’ table that contains the role information and facilitates the connection of the data. However, with relationship properties, this does not need to happen.

image

In this abstract model, you will see that the relationship itself now contains properties, so we need some way to model this in the Prisma schema. Below I propose my solution: it uses the already mentioned @edge attribute, where each property is defined inside a new properties key. You can think of a relationship property as a field on the relationship, and as such they should support all the same data types as fields on nodes.

model Movie {
  title       String
  imdbRating  Float
  actors      Person[] @edge(type: "ACTED_IN", direction: IN, properties: { roles String[] })
}

model Person {
  name  String
  age   Int
}

Now that we have a way to represent properties on relationships, we also need to establish a way of reading and writing those properties. Take, for example, the Prisma client and the current way to read and write data:

await prisma.movie.findMany({
  where: { title: 'Forrest Gump' },
  select: {
    title: true,
    imdbRating: true,
    actors: {
      // What about the relationship properties?
      select: {
        name: true,
        age: true,
      },
    },
  },
})

await prisma.movie.create({
  data: {
    title: 'Forrest Gump',
    imdbRating: 8.6,
    actors: {
      create: [
        // What about the relationship properties?
        {
          name: 'Tom Hanks',
          age: 70,
        },
      ],
    },
  },
})

What about the relationship properties? Where do we interact with them?

Here I would like to point you to the Relay Connections spec (https://relay.dev/docs/guides/graphql-server-specification/#connections), which establishes a way to model relationships along with their metadata in GraphQL, and I believe that we could model exactly that in Prisma. Using the Relay spec, here are the above read and write:

await prisma.movie.findMany({
  where: { title: 'Forrest Gump' },
  select: {
    title: true,
    imdbRating: true,
    actors: {
      select: {
        edges: {
          roles: true,
          node: {
            name: true,
            age: true,
          },
        },
      },
    },
  },
})

await prisma.movie.create({
  data: {
    title: 'Forrest Gump',
    imdbRating: 8.6,
    actors: {
      create: [
        {
          edge: {
            roles: ['Forrest Gump'],
            node: {
              name: 'Tom Hanks',
              age: 70,
            },
          },
        },
      ],
    },
  },
})
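
To show where the relationship properties could end up, here is a hedged sketch that turns the proposed nested create input into a parameterised Cypher statement. Everything in it (`MovieCreateInput`, `movieCreateToCypher`, the statement layout) is a hypothetical illustration of one possible translation, not how Prisma's query engine works or would work.

```typescript
// Hypothetical input types mirroring the proposed `edge` / `node` nesting.
interface PersonData {
  name: string;
  age: number;
}

interface ActorEdgeInput {
  edge: { roles: string[]; node: PersonData };
}

interface MovieCreateInput {
  title: string;
  imdbRating: number;
  actors: { create: ActorEdgeInput[] };
}

interface CypherStatement {
  text: string;
  params: {
    title: string;
    imdbRating: number;
    actors: Array<PersonData & { roles: string[] }>;
  };
}

// Build one statement that creates the movie, each actor, and an
// ACTED_IN relationship carrying the `roles` property. All values are
// passed as parameters rather than concatenated into the query text.
function movieCreateToCypher(data: MovieCreateInput): CypherStatement {
  return {
    text: [
      "CREATE (m:Movie {title: $title, imdbRating: $imdbRating})",
      "WITH m",
      "UNWIND $actors AS actor",
      "CREATE (p:Person {name: actor.name, age: actor.age})",
      "CREATE (p)-[:ACTED_IN {roles: actor.roles}]->(m)",
    ].join("\n"),
    params: {
      title: data.title,
      imdbRating: data.imdbRating,
      // Flatten each edge so node fields and edge properties sit side by side.
      actors: data.actors.create.map(({ edge }) => ({
        name: edge.node.name,
        age: edge.node.age,
        roles: edge.roles,
      })),
    },
  };
}

const statement = movieCreateToCypher({
  title: "Forrest Gump",
  imdbRating: 8.6,
  actors: {
    create: [
      { edge: { roles: ["Forrest Gump"], node: { name: "Tom Hanks", age: 70 } } },
    ],
  },
});
```

The point of the sketch is that the `roles` value travels with the relationship (`[:ACTED_IN {roles: ...}]`) instead of being stored on either node.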

One thought I would like to leave here is relevant beyond just a Neo4j provider: should Prisma think of changing its API to support this model? It opens up the ability to support more DB types, provides flexibility to paginate on nested relations/edges, and provides a platform to read and write any other metadata along with the relation.

Finally, Graph Capabilities

Graphs are very powerful. You can use them in a general-purpose way just like any other DB, but they come with some superpowers. The most talked-about superpower that graphs ship with is the ability to recommend data. Here we look at ways Neo4j has exposed this superpower via its GraphQL lib, and then leave room for others to bring ideas and suggestions to the table.

What are recommendations all about?

Say that in Neo4j you want to query an actor and their movies. You could write something like:

MATCH (actor:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(movie:Movie)
RETURN actor {.*, movies: collect(movie)} as actor

This would return a JSON-like structure with the actor and their connected movies. What's really powerful, however, is that you can use this same 'pattern matching' on a path to grab, for example, 'co-actors' by traversing the graph to other actors:

MATCH (actor:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person)
RETURN actor {.*, coActors: collect(coActor)} as actor

Read more about recommendations here: https://neo4j.com/developer/cypher/guide-build-a-recommendation-engine/
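
For readers less familiar with Cypher, the two-hop pattern above can be sketched in plain TypeScript over an in-memory list of ACTED_IN relationships. `ActedIn` and `coActorsOf` are hypothetical names used only for this illustration. Unlike `collect(coActor)`, this version de-duplicates co-actors who share more than one movie with the starting actor.

```typescript
// Hypothetical in-memory stand-in for (person)-[:ACTED_IN]->(movie).
interface ActedIn {
  person: string;
  movie: string;
}

function coActorsOf(actedIn: ActedIn[], name: string): string[] {
  // First hop: (actor)-[:ACTED_IN]->(movie)
  const movies = new Set(
    actedIn.filter((r) => r.person === name).map((r) => r.movie),
  );
  // Second hop: (movie)<-[:ACTED_IN]-(coActor), skipping the starting actor.
  const coActors = new Set<string>();
  for (const r of actedIn) {
    if (movies.has(r.movie) && r.person !== name) {
      coActors.add(r.person);
    }
  }
  return Array.from(coActors).sort();
}

const sampleRelationships: ActedIn[] = [
  { person: "Tom Hanks", movie: "Forrest Gump" },
  { person: "Robin Wright", movie: "Forrest Gump" },
  { person: "Gary Sinise", movie: "Forrest Gump" },
  { person: "Tom Hanks", movie: "Apollo 13" },
  { person: "Gary Sinise", movie: "Apollo 13" },
  { person: "Tom Hanks", movie: "Cast Away" },
  { person: "Helen Hunt", movie: "Cast Away" },
  { person: "Keanu Reeves", movie: "The Matrix" },
];
const coActors = coActorsOf(sampleRelationships, "Tom Hanks");
```

With the sample data, `coActors` is `["Gary Sinise", "Helen Hunt", "Robin Wright"]`; Keanu Reeves is excluded because no path of the required shape reaches him.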

The questions now are

  1. How do we model this using Prisma schema?
  2. How would you query it?
  3. Is it something Prisma should support?

Looking at how Neo4j does it

The Neo4j lib https://github.com/neo4j/graphql shares some similarities with Prisma. On their side, you define a GraphQL schema with annotations about the underlying data model and then you get a generated GraphQL API. The methods on that GraphQL API offer a feature set very similar to what Prisma exposes. For this particular problem, recommendations, they have an 'escape hatch': the @cypher directive. Below I show you a simple GraphQL schema:

type Movie {
    title: String
    imdbRating: Int
    actors: [Person] @relationship(type: "ACTED_IN", direction: IN)
}

type Person {
    name: String
    age: Int
    movies: [Movie] @relationship(type: "ACTED_IN", direction: OUT)
}

Then, if you were to pop this into the Neo4j GraphQL library, you could for example query for the actors and their movies:

query {
    people {
        name
        age
        movies {
            title
            imdbRating
        }
    }
}

If you wanted to use the escape hatch and introduce recommendations into your schema, you would annotate the schema with the @cypher directive and write custom logic:

type Movie {
    title: String
    imdbRating: Int
    actors: [Person] @relationship(type: "ACTED_IN", direction: IN)
}

type Person {
    name: String
    age: Int
    coActors: [Person] @cypher(statement: """
        MATCH (this)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person)
        RETURN coActor
    """)
}

Then you can query the coActors field just like any other field, and the predefined Cypher is injected into the generated query:

query {
    people {
        name
        age
        coActors {
            name
            age
        }
    }
}

If Prisma were to adopt this approach, it could look very similar. Given our already established data model, let's add the Cypher attribute:

model Movie {
  title       String
  imdbRating  Float
  actors      Person[] @edge(type: "ACTED_IN", direction: IN, properties: { roles String[] })
}

model Person {
  name  String
  age   Int
  coActors Person[] @cypher(statement: """
        MATCH (this)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(coActor:Person)
        RETURN coActor
    """)
}

Then, you could use PrismaClient to select that field:

await prisma.person.findMany({
  where: {
    name: 'Tom Hanks',
  },
  select: {
    coActors: {
      select: {
        name: true,
      },
    },
  },
})

This is a great approach that would allow users to cover all their complex use cases. However, there are some issues with it, listed below.

Problems with this approach

  1. You can't tell what the user will write
  2. It may not be the right level of abstraction for Prisma users
  3. Brings the DB mechanics to the schema

Generating Recommendations

Above we dived into the Cypher directive/attribute. However, could there be a possibility to pre-generate a generic enough set of recommendations? Given our schema, do we have enough information to auto-generate the coActors field, and what would that look like?

Of course, the field would not be called coActors, as that's a name we have given this particular graph traversal. However, could the API expose enough fields/input for users to specify a particular graph traversal?
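
One way to picture such an input is a list of hops, each naming a relationship type, a direction, and the label of the node it lands on. The sketch below compiles that list into a Cypher pattern. `Hop`, `HopDirection`, and `traversalToCypher` are hypothetical names, and this is only one possible shape for the idea, not a proposal for an actual Prisma API. Labels and types are interpolated without escaping, which real code would need to handle.

```typescript
// Hypothetical description of a single step in a user-specified traversal.
type HopDirection = "IN" | "OUT";

interface Hop {
  type: string;            // relationship type to follow
  direction: HopDirection; // OUT: away from the current node, IN: towards it
  label: string;           // label of the node reached by this hop
}

// Compile a start label and a list of hops into a Cypher query that
// returns the nodes reached by the final hop.
function traversalToCypher(startLabel: string, hops: Hop[]): string {
  if (hops.length === 0) {
    throw new Error("a traversal needs at least one hop");
  }
  let pattern = `(this:${startLabel})`;
  hops.forEach((hop, index) => {
    const isLast = index === hops.length - 1;
    // Only the final node is named, because it is the one returned.
    const node = isLast ? `(target:${hop.label})` : `(:${hop.label})`;
    pattern +=
      hop.direction === "OUT"
        ? `-[:${hop.type}]->${node}`
        : `<-[:${hop.type}]-${node}`;
  });
  return `MATCH ${pattern}\nRETURN target`;
}

// The coActors traversal expressed as two hops.
const coActorsQuery = traversalToCypher("Person", [
  { type: "ACTED_IN", direction: "OUT", label: "Movie" },
  { type: "ACTED_IN", direction: "IN", label: "Person" },
]);
```

For the two hops shown, `coActorsQuery` contains the same pattern as the hand-written @cypher statement: `(this:Person)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Person)`.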

Summary

As someone who is part of the Prisma team, I would like to state that this comment represents no commitment from Prisma or any members of the Prisma team to support Neo4j. Adding another connector requires an immense amount of engineering effort, and there are so many unknowns and gotchas in doing so that this comment doesn't do them justice. I felt qualified to share my opinions and findings on this topic, seeing that I previously worked in the Neo4j team on the GraphQL library, and I feel that my understanding and learnings from doing so can, and should, be shared here.

@jpwogaman

this is a great write-up!! gave me a lot to think about.
