Skip to content

Latest commit

 

History

History
444 lines (420 loc) · 17.8 KB

2022-08-17.md

File metadata and controls

444 lines (420 loc) · 17.8 KB

Composite Schemas WG - August 2022

Watch the replay: Composite Schemas Working Group Meetings on YouTube

Determine volunteers for note taking (1m, Benjie)

  • Jonny

Introductions

Review agenda (2m, Benjie)

  • Not many items added
  • Subschema composability rules (Uri)
  • Next steps for the glossary (Laurin)

Review previous meeting's action items (5m, Benjie)

  • No explicit action items
  • Pendrag:
    • Get a sense of who looked at composability rules

Next steps for subschema composability rules (15m, Uri)

  • Uri:
    • Not a lot to add on top of what Jacob had to say
    • Very interesting
    • Hope to take the conversation to what would be the next steps
  • Yaacov
    • Would love to hear people’s feedback
    • Thoughts already expressed in PR
    • There may be errors since it’s a first draft
    • The people weigh in, the better
    • Kept it broad
      • Execution as a single vs execution to the subschemas that were the objects of the composition
      • Single-service model being much more flexible
    • Thoughts for next steps
      • Work on some code to validate
      • (Cut out, need to re-listen to get notes here)
  • Benjie

  • Yaacov
    • Federation is more strict rather than at the current time
    • Don’t think it’s a good idea to be lax on things like enums
    • Merging on some other gateways is only on object types which is more strict
    • Define an upper bound for what breaks composability
    • Then add layers which contain warnings
  • Matteo:
    • The way namespacing and transformations happen seems quite heavy
    • Something that’s missing is the concept of a namespace/modules
    • Maybe possible to add this concept? Instead of renaming
  • John:
    • No namespacing allows us to present a unified schema
    • Nice to have it all in one simple flow
  • Julian
    • Maybe can treat namespacing as an internal thing
  • Paolo
    • Maybe treat namespacing to treat at build time rather than run-time
  • Laurin:
    • Might be no need to namespacing types
    • But rather, merge/compose them
  • John:
    • Maybe it is good to opt-in
    • If designing a schema that is intended to be composed does this define meanings of types
  • Laurin:
    • Needs to be some metadata
    • Identifying use-cases where they can be merged
  • Paolo:
    • When you are talking about metadata are you talking type name
    • Will it use introspection?
  • Derek:
    • Outside of this working group, but there is some work to get this information
    • Not yet available but there is an active proposal
    • Opened a couple of weeks ago
  • Paolo
    • People should be able to reach out and say give me the data
    • Should not rely on introspection
      • Query might be never ending
  • Pendrag
    • I’ve had that problem and share concern
  • Derek
    • Slow introspection query issue is when using for run-time composition
    • Not an issue for build-time composition
  • Predrag
    • Can be forced to evaluate introspection at run-time
  • Paolo
    • Should we have rules for namespacing?
    • Or not?
  • Yaacov
    • Important to maintain the ability to compose schema that you do not manage
    • Might want to hide the fact that that data is remote
    • Create a unified APi from remote that don’t control
  • Paolo
    • Composition engine can be combined in many ways which might not have the same rules
    • Something must compose the schema
  • John
    • Maybe a dividing line between schema ownership
  • Derek
    • Maybe things need to be renamed if defining the same type
  • Paolo
    • This goes against composing a schema you don’t control
  • Yaacov
    • May be talking about different things
    • Composing sub-schemas don’t control
      • May need transformations
      • And namespacing is a technique
    • Interested in subschemas that don’t control and merging these is important
    • Namespacing is a type of transformation that is helpful for composing subschemas
    • Don’t have to be defined by subschema itself
  • Paolo
    • Case when don’t control is the most complex
    • If we solve this, others are then solved as well
  • Laurin
    • What’s a use-case?
  • Matteo
    • Exposing a single graph with no namespace info is more user friendly
    • Masks the complexity of doing this merging
    • If 2 graphs with 2 teams, they need to coordinate a lot in reality
    • Can see the fact that in certain companies that expose some concept of namespacing they are exposed in a way that is clear where it came from
    • Namespacing could avoid a lot of conflicts and have things work out of the box in a lot of cases
    • Concern is that at the moment, there are a lot of things that can go wrong and not simple to maintain
  • Joey
    • Another use-case that is related
    • Sometimes you do want merge that that you have control over but don’t want to control
    • E.g. Let people onboard there own namespace and watch for commonalities
    • Do want to merge things that we technically control but don’t actually want to control
  • Paolo
    • Assume schema is split
    • Let’s have a User service and Book service
    • Should not expect that the user service can talk to the book service
  • Predrag
    • Can happen in the real world where services talk to each other
    • Metadata service that gives execution plan
    • Then all done client side
    • Composition logic is housed in a service but execution is client side
    • Helpful to not assume that the schema operations have to be done by service
    • Metadata for how the composition happens should be able to execute on both client and server
  • Paolo
    • This is amazing
    • There is also a synchronization problem? Or maybe not?
    • Could produce a concurrency problem that could go bad?
  • Predrag
    • It makes concurrency problems more obvious
  • Paolo
    • I see your point
  • Yaacov
    • Not aware of existing composition that works like that
    • Seems like a more mesh-like approach
    • Might run into looping problems
    • Another thing that your question touches on is, is the subschema a composed schema
  • Paolo
    • I see that a subschema can be composite
  • Benjie
    • This is a really interesting discussion
    • Sometimes compose schemas that do or don’t control
    • One of things we come back to is there description of data and metadata
    • Open work on managing this
    • Can’t put metadata into a remote schema that don’t control
    • Will say the topic of explicit namespaces has been proposed a few times at the working group
      • Not made much progress
      • Bar for adding it quite high
      • Reason: GraphLQ focussed on schema that is designed to be consumed
      • Shouldn’t necessarily be exposing these complexities
    • Another thing is automatically renaming things
      • One thing that comes out is an explosion of types
      • Better to have explicit rules for these
    • Would like to continue discussion but let’s move onto glossary and circle back at the end

Next steps for the glossary (10m, Laurin)

  • Agenda:

    • splitting it into glossary + appendix for branded terms and appendix for actual implementations
    • call for feedback and reviews
  • Laurin

    • Last WG we all agreed terminology is currently very vague
    • Idea is to create glossary and unified terms that everyone can use
    • Created a PR based on an internal doc
    • Right now, a collection of terms describing stuff
    • Also contains branded terms such as apollo federation
    • Want to ask for feedback and add missing stuff
    • Next step is to divide it a bit
      • Branded terms sep. doc?
      • Appendix?
      • Open Source/Hosted solutions
      • Etc
    • Maybe pick the correct solutions from the provided ones
  • Benjie

    • Excellent work Laurin
    • Great to have shared vocab
    • As you point out, separating the branded would be beneficial
    • More generic terms could be use in whatever form of spec we decide to build
    • E.g. Came up with subschema last month
    • What do you think the next step is?
  • Laurin

    • Yes, feedback
    • And then for me, the next step is to split it into bits
    • And figure where we want to put this stuff
    • MAybe someone has a good idea
  • Benjie

    • Propose to split into 2 files as discussed and use the RFCs folder
    • Advantage of having it merged is people can send PRs to it and iterate quicker that way
    • RFCs folder has a low bar top getting things merged to it
      • Could use an alternative folder (e.g. docs)
      • But might add additional weighting which it is not ready for
    • RFCs is good bc not official
    • Wdyt
  • LAurin

    • Sounds good
  • Benjie

    • Would you like to take as an action item?
  • LAurin

    • Sounds good
  • Predrag

    • What about an anti-glossary?
  • Benjie

    • Brilliant idea
    • Yaacov did the same and would be very beneficial
  • Predrag

    • More technical q for laurin and benjie
    • Most things are open source in the glossary, what about non-open source contributions that you can’t reference explicitly
  • Benjie

    • Would like to gather use-cases
    • Such as existing solution (apollo fed, schema stitching)
    • A doc on each of these would be beneficial
    • Can find commonalities and standard metadata
    • Where should we put it? Another folder like RFCs folder with README
    • Would you like that action item Predrag?
  • Predrag

    • Yeah definitely
    • Happy to add another entry to reference blog posts etc within the existing document
    • Whatever you prefer
  • Benjie

    • Any other thoughts?
  • John

    • Makes sense
  • Joey

    • We use the term namespacing a lot but doesn’t always mean the same thing
    • Maybe we address this?
  • Predrag

    • Approach in rust, if lack of consensus, use an explicit term that is not standardised
  • John

    • Sounds like there was also prefixing and metadata driven one as well
  • Benjie

    • 3 docs:

      • One is glossary of generic terms
      • One which is a list of vendor terms
        • People can add their own things etc
        • Reference projects
      • Terms to avoid
      • Glossary.md (generic terms)
    • VendorSolutions.md (the solutions that are out there)

    • TermsToAvoid.md (branded/misleading/ambiguous terms)

    • Want to talk about concept of data
    • Is everything we need to do, doable from metadata or do we sometimes need code as well?
    • Is there a need for code or will metadata suffice?
  • Paolo - I missed this, Paolo could you provide your answer if possible please? Thank you!

  • Predrag

    • Should be doable just as data and not require code
  • Benjie

    • Think we’re done on glossary topic, let’s go to previous topic
    • Want to capture action items, is there anything else?
  • Matteo

    • Want to talk about previous subject
    • If I understand correctly, what Yaacov is doing is finding the most tools that we can agree on to compose schema
    • Then we can think separately if it’s composable or not
    • I like what Yaacov has done in identifying what does it mean to compose a schema
    • Create an action item to create something like a test suite
      • What is legal
      • What is not legal
    • People can confirm if they agree with it
    • Can then come up with all sorts of use-cases for this
  • Benjie

    • It would be valuable to find out why rules don’t match
  • Paolo

    • To clarify, is this something that would eventually become a test suite
  • Matteo

    • Yes
  • Uri

    • Other people can look at this and confirm that this makes
  • Benjie

    • Definitely an action time for everyone to look at Yaakov's proposal
  • Yaacov

    • Do people want me to work on a code implementation?
    • Some kind of reference implementation that exists or shouldn't
  • Predrag

    • Test cases should live in RFC
    • As for implementation, no strong feelings other than what language
  • Yaacov

    • I was thinking In JavaScript
  • John

    • I agree
    • Given a, b, c and does it compose d
  • Benjie

    • We need to be aware of not limiting innovation
    • And make sure it’s not the only way of doing x
    • We want to identify what are the common cases and write those down
  • Paolo

    • I completely agree
  • Predrag

    • I saw it as if you have these subschema they are composable in x situation and not in y situation
  • John

    • Maybe test suite is not the right name, rules suite perhaps?
  • Predrag

    • I had a similar idea
  • Paolo

    • How we reach that point we shouldn’t care how the client should implement
  • Jason

    • Different mechanisms for different vendors - how do we think about unifying this
    • There are different tradeoffs in each
    • Some have conventions, some don’t
    • And there are a breadth of solutions
    • There is a little more than merging the schemas
  • Paolo

    • The GraphQL spec doesn’t define metadata, right?
    • Or am I wrong on this?
  • Jason

    • There are directives that are used/interpreted by infrastructure
    • This extra metadata is interpreted
    • Most vendor capture how do we merge x and y through this
    • But there is a range of solutions out there
  • Benjie

    • Identity is a problem that been out there for a while
    • You are correct that many solutions have to solve in one way or another
    • What I would like to see is a grid comparison of a shared topic
    • This would be valuable to see different approaches written down
    • We can make decisions based on this to find out if solutions follow similar patterns or not
    • Would love to see for each of these different topics:
      • Identifying what the schemas
      • How to retrieve things
    • We are not at the point of specifying how we act on this data
    • But the indication of the identity is very valuable
    • If someone can take on the work of comparing these solutions would be valuable
  • Predrag

    • If we are open to unorthodox solutions
    • We have no type merging at Kenshu
    • It makes the composition logic simpler
    • E.g. User has a library of books in another service
    • Allow you define macros
      • Pretend that some edge exists that is actually composition of edges
    • Before execution, these get expanded
    • Every type has a clear owner/ID while the user writing this query gets to pretend the shape is something they control
  • John

    • Clarifying q
      • Each type gets a specifiedby scalar saying where it comes from
  • Predrag

    • For any type in the final composed schema you know explicitly where it came from
    • Any edge can either be within the schema or can be crossing a schema boundary and can be injected at composition time
    • Lend to client-centric execution
    • Execute components and then connect the data sets back together
  • John

    • So you saying there is a UUID that they use
    • If they share they get composed at the same type
  • Predrag - I missed a bit of this - can you fill out please? Thank you! _ Example _ Slightly outdated code example repo from a meetup 3 years ago, but still demonstrates the overall idea:
    https://predr.ag/talks/#graphql-as-a-high-performance-cross-database-query-language

        [https://github.com/obi1kenobi/graphql-compiler-cross-db-example](https://github.com/obi1kenobi/graphql-compiler-cross-db-example)
    
        * The system’s query syntax is a bit different than what you’re used to:  [https://github.com/obi1kenobi/graphql-compiler-cross-db-example/blob/master/01_intro_demo.ipynb](https://github.com/obi1kenobi/graphql-compiler-cross-db-example/blob/master/01_intro_demo.ipynb)
        * Schema reshaping + cross-datasource execution: [https://github.com/obi1kenobi/graphql-compiler-cross-db-example/blob/master/02_macro_edges.ipynb](https://github.com/obi1kenobi/graphql-compiler-cross-db-example/blob/master/02_macro_edges.ipynb)
        * In this case, the schema reshaping is through _macro edges,_ edges that are added for user convenience so the user doesn’t have to navigate the actual complexity of the underlying schema. What the user perceives as a macro edge can actually be hiding multiple edge traversals + predicates under the hood, which the query-writing user doesn’t care about.
        * We can define other kinds of macros as well:
            * macro fields: non-edge fields (scalars and their lists) whose values are defined as a transformation across other fields, potentially traversing edges. For example: a “booksOwnedCount: Int!” macro field on “User” can be a macro field expanding to “traverse the User.ownedBooks edge and count how many vertices are on the other side”.
            * macro types: akin to a view in SQL, a type that is defined as a composition of other types
    
  • Jason

    • How do know which IDs to send
    • And which correspond to which metadata
  • Predrag

    • Fetch one field and feed into filter onto the other side
  • Jason

    • Who is responsible for owning this data
  • Predrag

    • We are lucky with having a single database that contains the metadata on how data sources interconnect
  • Benjie

    • Action item for Predrag to add example in the glossary for the Kensho example
    • Would like everyone to have a look at the PR for what schema composability is
      • If we don’t have agreement, figure out why we don’t have agreement
    • Look at existing solutions, find the standard pieces and specifying those bits
    • Do not want to limit innovation e.g Predrag’s solution
    • If there’s nothing else, would be good to call this meeting to a close
    • Next meeting on 8th september

Action items

  • Everyone
    • Everyone to have a look at the PR for what schema composability is
  • Predrag
    • Predrag to add an example in the glossary/relevant doc for the Kensho example mentioned
    • A doc on each of existing solutions to find commonalities and standard metadata
      • To be put in another folder like RFCs folder with README
      • Or suitable alternative
  • Yaacov
    • To look into defining a test or rules suite reference implementation for defining what is and what isn’t composable
  • Laurin
    • To split existing document up into 3 parts:
      • Glossary.md (generic terms)
      • VendorSolutions.md (the solutions that are out there)
      • TermsToAvoid.md (branded/misleading/ambiguous terms)