Support for multiple operations in a single request #795

Open
tkellen opened this Issue Jun 29, 2015 · 126 comments

@tkellen
Member

tkellen commented Jun 29, 2015

It has been widely requested that JSON-API support creating/editing/updating/deleting multiple records in a single request (#202, #205, #753, #536). We have attempted to paint this bikeshed repeatedly over the years. Providing an "official" solution, or solutions, to this problem is one of the primary goals for v1.1 of this specification.

There are several requirements for a solution:

  • It should allow mixed actions in a single request (e.g. creating + updating).
  • It should work for multiple types of resources (e.g. articles + tags) that would otherwise be accessed at different endpoints.
  • Every request should be transactional (i.e. all operations should succeed or fail together).
  • Additive to the base spec (probably as an extension).
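
As a rough illustration of the transactional requirement above, here is a minimal Python sketch of all-or-nothing processing against an in-memory store. The `apply_all` function, the dict-based store, and the operation shape (`action`, `type`, `id`, `attributes`) are hypothetical and exist only for this example; a real server would more likely lean on its datastore's own transactions.

```python
import copy


def apply_all(store, operations):
    """Apply every operation or none of them (hypothetical in-memory model).

    `store` maps (type, id) -> attributes dict. Operations run against a
    working copy, which is returned only if all of them succeed.
    """
    working = copy.deepcopy(store)
    for op in operations:
        key = (op["type"], op["id"])
        if op["action"] == "add":
            if key in working:
                raise ValueError(f"{key} already exists")
            working[key] = dict(op.get("attributes", {}))
        elif op["action"] == "update":
            if key not in working:
                raise KeyError(f"{key} does not exist")
            working[key].update(op.get("attributes", {}))
        elif op["action"] == "remove":
            if key not in working:
                raise KeyError(f"{key} does not exist")
            del working[key]
        else:
            raise ValueError(f"unknown action {op['action']!r}")
    return working


store = {("tags", "tag-1"): {"name": "one"}}
ops = [
    {"action": "add", "type": "tags", "id": "tag-3",
     "attributes": {"name": "three"}},
    # This update targets a resource that does not exist, so the batch fails.
    {"action": "update", "type": "tags", "id": "missing",
     "attributes": {"name": "x"}},
]
try:
    store = apply_all(store, ops)
except (KeyError, ValueError):
    pass  # the failed batch leaves `store` untouched
```

Because the second operation fails, the tag added by the first one never becomes visible in `store`.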

Here are a couple of use cases that test these requirements:

  • Create two articles, add one existing tag to both, create a new tag and add to both.
  • Change the title of an article and update its tags: add one existing tag, create another, and remove another (without replacing the whole tag set).

We would like the community to propose ideas for how to solve these use cases (or chime in to support solutions provided in the responses to follow).

Assume the following data is present:

GET /articles

{
  "data": [{
    "id": "article-1",
    "type": "articles",
    "attributes": {
      "title": "the first article",
      "content": "some content for an article"
    },
    "relationships": {
      "tags": {
        "data": [
          { "id": "tag-1", "type": "tags" }
        ]
      }
    }
  }]
}

GET /tags

{
  "data": [{
    "id": "tag-1",
    "type": "tags",
    "attributes": {
      "name": "one"
    }
  }, {
    "id": "tag-2",
    "type": "tags",
    "attributes": {
      "name": "two"
    }
  }]
}
@dgeb

Member

dgeb commented Aug 19, 2015

Potential Solution: Operations Extension

One solution to this problem is to represent each operation as an entity in a request document. Operations could be transmitted in an array, since ordering is often significant. Similarly, responses to each operation could be returned in an array that parallels the operations in a request.

This solution is very similar to the experimental JSON Patch extension with the exception that it is additive to the base spec.

Operations could include the following members:

  • path - the path to the applicable resource, if different from the request path
  • action - add / remove / update actions that map to HTTP methods
  • data - the data normally sent as top-level data

Operations would be sent in a top-level operations array.

Responses could also be returned in a top-level operations array, in which the status member could be added to each operation.

Requests would be sent with the PATCH method together with the TBD extension requirement.

It's important to mention that a server would have complete control over which operations could be performed together at any endpoint.
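
To make the shape of an operation concrete, here is a minimal Python sketch that expands an operations document into the individual requests it stands for. The `ACTION_TO_METHOD` table is an assumption (the proposal only says that actions map to HTTP methods, not which ones), and `expand_operations` is a hypothetical helper name.

```python
# Assumed mapping; the proposal only says actions "map to HTTP methods".
ACTION_TO_METHOD = {"add": "POST", "update": "PATCH", "remove": "DELETE"}


def expand_operations(request_path, document):
    """Return (method, path, data) for each operation, in request order."""
    expanded = []
    for op in document["operations"]:
        method = ACTION_TO_METHOD[op["action"]]
        # "path" is optional and defaults to the path of the enclosing request.
        path = op.get("path", request_path)
        expanded.append((method, path, op.get("data")))
    return expanded


doc = {
    "operations": [
        {"path": "/tags", "action": "add",
         "data": {"type": "tags", "attributes": {"name": "three"}}},
        {"action": "update",
         "data": {"type": "articles", "id": "1",
                  "attributes": {"title": "Something more exciting!"}}},
    ]
}
calls = expand_operations("/articles/1", doc)
```

Order is preserved because later operations may depend on the results of earlier ones.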

Solutions for use cases

Create two articles, add one existing tag to both, create a new tag and add to both

Request:

PATCH /articles

{
  "operations": [{
    "path": "/tags",
    "action": "add",
    "data": {
      "type": "tags",
      "attributes": {
        "name": "three"
      }
    }
  }, {
    "action": "add",
    "data": {
      "type": "articles",
      "attributes": {
        "title": "A second article",
        "content": "Lorem ipsum."
      },
      "relationships": {
        "tags": {
          "data": [
            { "id": "tag-1", "type": "tags" },
            { "pointer": "/operations/0/data" }
          ]
        }
      }
    }
  }, {
    "action": "add",
    "data": {
      "type": "articles",
      "attributes": {
        "title": "A third article",
        "content": "Lorem ipsum."
      },
      "relationships": {
        "tags": {
          "data": [
            { "id": "tag-1", "type": "tags" },
            { "pointer": "/operations/0/data" }
          ]
        }
      }
    }
  }]
}

Response:

{
  "operations": [{
    "status": "201",
    "data": {
      "id": "tag-3",
      "type": "tags",
      "attributes": {
        "name": "three"
      }
    }
  }, {
    "status": "201",
    "data": {
      "id": "articles-2",
      "type": "articles",
      "attributes": {
        "title": "A second article",
        "content": "Lorem ipsum."
      },
      "relationships": {
        "tags": {
          "data": [
            { "id": "tag-1", "type": "tags" },
            { "id": "tag-3", "type": "tags" }
          ]
        }
      }
    }
  }, {
    "status": "201",
    "data": {
      "id": "articles-3",
      "type": "articles",
      "attributes": {
        "title": "A third article",
        "content": "Lorem ipsum."
      },
      "relationships": {
        "tags": {
          "data": [
            { "id": "tag-1", "type": "tags" },
            { "id": "tag-3", "type": "tags" }
          ]
        }
      }
    }
  }]
}
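
One way a server might turn the pointer members above into real resource identifiers is sketched here in Python. It assumes the server resolves each pointer against the results of operations it has already processed, so that /operations/0/data refers to the created tag; that reading, and the helper names `resolve_pointer` and `to_identifier`, are assumptions rather than anything the proposal specifies.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against a parsed JSON document."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape per RFC 6901: "~1" -> "/" first, then "~0" -> "~".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


def to_identifier(results, linkage):
    """Turn {"pointer": ...} into {"id", "type"} using earlier results."""
    if "pointer" not in linkage:
        return linkage
    resource = resolve_pointer(results, linkage["pointer"])
    return {"id": resource["id"], "type": resource["type"]}


# Result of the first operation (the new tag), as in the response above.
results = {"operations": [
    {"status": "201", "data": {"id": "tag-3", "type": "tags",
                               "attributes": {"name": "three"}}}
]}
linkage = [{"id": "tag-1", "type": "tags"}, {"pointer": "/operations/0/data"}]
resolved = [to_identifier(results, item) for item in linkage]
```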

Change the title of an article and update its tags: add one existing tag, create another, and remove another (without replacing the whole tag set).

Request:

PATCH /articles/1

{
  "operations": [{
    "path": "/tags",
    "action": "add",
    "data": {
      "type": "tags",
      "attributes": {
        "name": "three"
      }
    }
  }, {
    "action": "update",
    "data": {
      "type": "articles",
      "id": "1",
      "attributes": { "title": "Something more exciting!" }
    }
  }, {
    "path": "/articles/1/relationships/tags",
    "action": "add",
    "data": [
      { "pointer": "/operations/0/data" }
    ]
  }, {
    "path": "/articles/1/relationships/tags",
    "action": "remove",
    "data": [
      { "id": "tag-1", "type": "tags" }
    ]
  }, {
    "path": "/articles/1/relationships/tags",
    "action": "add",
    "data": [
      { "id": "tag-3", "type": "tags" }
    ]
  }]
}

Response:

{
  "operations": [{
    "status": "201",
    "data": {
      "id": "tag-3",
      "type": "tags",
      "attributes": {
        "name": "three"
      }
    }
  }, {
    "status": "204"
  }, {
    "status": "204"
  }, {
    "status": "204"
  }, {
    "status": "204"
  }]
}

@json-api json-api unlocked this conversation Aug 19, 2015

@dgeb dgeb changed the title from Embedding / Sideposting to Support for multiple operations in a single request Aug 19, 2015

@tkellen

Member

tkellen commented Aug 19, 2015

Leaving notes here for posterity: @dgeb and I discussed the possibility that we should consider supporting a response flag in operations with two allowed values, either full or status. This would control how detailed the response from the server is for each operation. This would cover the case where the client does not care about the details of the intermediate representations created within a transaction.
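
A minimal Python sketch of what such a flag could do to the per-operation results. Whether the flag would sit on each operation or on the whole request is not settled here; this sketch assumes one flag per request, and `shape_results` is a hypothetical name.

```python
def shape_results(results, response="full"):
    """Trim per-operation results according to a hypothetical response flag."""
    if response == "full":
        return results
    if response == "status":
        # Keep only the status member of each result.
        return [{"status": r["status"]} for r in results]
    raise ValueError("response must be 'full' or 'status'")


results = [
    {"status": "201", "data": {"id": "tag-3", "type": "tags"}},
    {"status": "204"},
]
compact = shape_results(results, response="status")
```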

We might also need to support a fetch action whose path can be a url or a pointer to one of the operations. This would allow a client to request a full representation of a json-api resource they care about (as it appears after all of the preceding operations were performed).

As an aside, this entire proposal feels like gross RPC, but there really doesn't seem to be a path forward to enable the complex use-cases we are trying to support without this type of extension.

@jakerobers

jakerobers commented Aug 19, 2015

This will probably get messy on the client side when you have relationships multiple layers deep. Doing multiple requests keeps data in a flatter structure, and will keep things simpler, in my opinion.

@tkellen

Member

tkellen commented Aug 19, 2015

This will probably get messy on the client side when you have relationships multiple layers deep. Doing multiple requests keeps data in a flatter structure, and will keep things simpler, in my opinion.

You can totally do that. We're discussing an optional/additive extension to the base specification that has been widely requested (see the issues referenced in the OP). If you don't want to use it, this issue doesn't apply to you.

@mfpiccolo

mfpiccolo commented Aug 19, 2015

I agree with @tkellen. Yes, using multiple requests keeps things simpler, however sometimes it doesn't make sense to do multiple requests especially if they are to be treated as a single transaction. It would be very difficult to manage a block of separate requests as a single transaction, rollback changes and return error states for the objects.

I am currently using included to handle these multi-operational actions, which has its downsides, and if I am rolling my own, it means others are too. That is a good indication that we need a spec.

@shicholas

shicholas commented Aug 19, 2015

I really like the proposed solution above. I am excited to see a standard arise out of this.

regarding this:

I discussed the possibility that we should consider supporting a response flag in operations with two allowed values, either full or status.

I like the idea of building an app that returns just the status if all my operations were update/delete, and full if any of the operations was an add. So I would like to see a response flag, perhaps as metadata?

and this:

We might also need to support a fetch action whose path can be a url or a pointer to one of the operations. This would allow a client to request a full representation of a json-api resource they care about (as it appears after all of the preceding operations were performed).

Does this mean that the server will return something other than the operations array? If so, I don't think it's a good idea b/c these resources can be included in that array already.

@daliwali

Contributor

daliwali commented Aug 19, 2015

Not sure how I feel about this from an implementer's perspective. It adds a lot of complexity just to save on HTTP requests. That said, I'm glad it's being discussed as an optional extension.

@dgeb

Member

dgeb commented Aug 19, 2015

It adds a lot of complexity just to save on HTTP requests.

That is one primary goal. Another is to provide a means by which multiple operations can be performed in a single transaction.

And still another goal particular to the operations extension is to provide a format compatible with streams (e.g. web sockets).

@tkellen

Member

tkellen commented Aug 19, 2015

We might also need to support a fetch action whose path can be a url or a pointer to one of the operations. This would allow a client to request a full representation of a json-api resource they care about (as it appears after all of the preceding operations were performed).
Does this mean that the server will return something other than the operations array? If so, I don't think it's a good idea b/c these resources can be included in that array already.

No, it means you might have a request like this:

{
  "operations": [{
    "path": "/articles/1/relationships/tags",
    "action": "remove",
    "data": [
      { "id": "tag-1", "type": "tags" }
    ]
  }, {
    "path": "/articles/1/relationships/tags",
    "action": "add",
    "data": [
      { "id": "tag-3", "type": "tags" }
    ]
  }, {
    "path": "/articles/1",
    "action": "fetch"
  }]
}

...with a response like:

{
  "operations": [{
    "status": "204"
  }, {
    "status": "204"
  }, {
    "status": "200",
    "data": {
      "id": "1",
      "type": "articles",
      "attributes": {
        "title": "the first article",
        "content": "some content for an article"
      },
      "relationships": {
        "tags": {
          "data": [
            { "id": "tag-3", "type": "tags" }
          ]
        }
      }
    }
  }]
}
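
For anyone who wants to play with the ordering semantics, here is a small in-memory Python sketch of processing such a request. It is simplified on purpose: `run` is a hypothetical helper, it only handles to-many relationship paths, and its fetch returns the current linkage at a relationship path rather than a full resource as in the response above.

```python
def run(relationships, operations):
    """Apply add/remove/fetch to to-many linkage, in order (in-memory sketch).

    `relationships` maps a relationship path to a list of resource
    identifiers. A fetch sees the effect of every preceding operation.
    """
    results = []
    for op in operations:
        linkage = relationships[op["path"]]
        if op["action"] == "add":
            for item in op["data"]:
                if item not in linkage:
                    linkage.append(item)
            results.append({"status": "204"})
        elif op["action"] == "remove":
            linkage[:] = [item for item in linkage if item not in op["data"]]
            results.append({"status": "204"})
        elif op["action"] == "fetch":
            results.append({"status": "200", "data": list(linkage)})
        else:
            raise ValueError(f"unknown action {op['action']!r}")
    return results


path = "/articles/1/relationships/tags"
relationships = {path: [{"id": "tag-1", "type": "tags"}]}
results = run(relationships, [
    {"path": path, "action": "remove",
     "data": [{"id": "tag-1", "type": "tags"}]},
    {"path": path, "action": "add",
     "data": [{"id": "tag-3", "type": "tags"}]},
    {"path": path, "action": "fetch"},
])
```
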
@shicholas

shicholas commented Aug 19, 2015

oh cool, I could see the utility in that. Thanks for the example.

@ethanresnick

Member

ethanresnick commented Aug 20, 2015

As an aside, this entire proposal feels like gross RPC, but there really doesn't seem to be a path forward to enable the complex use-cases we are trying to support without this type of extension.

That sounds like a fair diagnosis to me. That is, I agree that there are some complex cases that require transactional and imperative/RPC-ish semantics. Therefore, we'll need something reasonably like JSON PATCH, and @dgeb's proposal here seems cleaner than the current JSON PATCH extension.

However, I wonder if we could cover 80% of the embedding/sideposting use cases with a more declarative, higher-level format and, if so, whether that would make sense.

Sideposting Proposal

Overview

The other, higher-level option I was thinking about is an extension, negotiated with the TBD extension mechanism, that would be very similar to what I proposed with an "embedded" member in #536. The only differences are that it would:

  • be merged with the bulk extension
  • be extended to work with PATCH requests too
  • include more normative guidance on which types of relationship graphs must be supported, and a way to signal other graph types that happen to be supported
  • offer a standard way to determine which embedded resource was assigned which server id (if server ids are in use).

Use Cases

Here's how it would work with the two use cases given above.

Creating two articles and adding existing and new tags to each:

Request:

POST /articles

{
  "data": [{
    "type": "articles",
    "attributes": {
      "title": "A second article",
      "content": "Lorem ipsum."
    },
    "relationships": {
      "tags": {
        "data": [
          { "id": "tag-1", "type": "tags" },
          { "pointer": "/embedded/0" }
        ]
      }
    }
  }, {
    "type": "articles",
    "attributes": {
      "title": "A third article",
      "content": "Lorem ipsum."
    },
    "relationships": {
      "tags": {
        "data": [
          { "id": "tag-1", "type": "tags" },
          { "pointer": "/embedded/0" }
        ]
      }
    }
  }],
  "embedded": [{
    "type": "tags",
    "attributes": {
      "name": "three"
    }
  }]
}

This is as I proposed in #536, except that I've taken advantage of the idea in the bulk extension to make "data" an array. The response would look exactly like the request, except that each resource object would now have a server-assigned id too.

Change the title of an article and update its tags: add one existing tag, create another, and remove another (without replacing the whole tag set).

This use case would need to be handled by operations, if it's to be done in one request. Otherwise, it would take 3 requests (which might be ok). One request would update the article's title; one would create the new tag and add it with the existing one; and one would remove the other tag. The interesting request is the one that creates and adds the new tag simultaneously, which would look (as you'd expect) like this:

POST /articles/1/relationships/tags

{
  "data": [
    { "pointer": "/embedded/0" },
    { "type": "tags", "id": "tag-2" }
  ],
  "embedded": [
    { "type": "tags", "attributes": { "name": "tag 3" } }
  ]
}

Specification Details

First, as I mentioned, this proposed extension would also include a way for the client to determine which embedded resources were assigned which ids (if server-side ids are in use). That would work like so:

In the request:

// …
"embedded": [
  { "type": "tags", "attributes": { … }, "temp-id": "1" }
]

Then, in the response:

// …
"temp-ids": {
  "1": "de305d54-75b4-431b-adb2-eb6b9e546014"
}

The "temp-id" key would be totally optional on the client's end, but the server would be required to send back the "temp-ids" mapping with any "temp-id"s it received. Alternatively, "temp-id" could be made mandatory and used in place of the JSON Pointers; I was in favor of that in #536, but am indifferent at this point. Also, if "temp-id"s are used, they might alternatively live/be returned in "meta", rather than at the top-level, depending on whether the extension system ultimately allows an extension to "claim" "meta" members.
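
As a sketch of the server side of this, the Python below assigns ids to embedded resources and builds the "temp-ids" mapping for the response. The `create_embedded` helper and its injectable `new_id` parameter are hypothetical; the use of UUIDs as the default simply mirrors the example id above.

```python
import uuid


def create_embedded(embedded, new_id=lambda: str(uuid.uuid4())):
    """Assign server ids to embedded resources and report temp-id mappings."""
    created, temp_ids = [], {}
    for resource in embedded:
        # Copy everything except the client-only "temp-id" member.
        stored = {k: v for k, v in resource.items() if k != "temp-id"}
        stored["id"] = new_id()
        created.append(stored)
        # "temp-id" is optional, so only echo back the ones we received.
        if "temp-id" in resource:
            temp_ids[resource["temp-id"]] = stored["id"]
    return created, temp_ids


embedded = [
    {"type": "tags", "attributes": {"name": "three"}, "temp-id": "1"},
    {"type": "tags", "attributes": {"name": "four"}},
]
ids = iter(["tag-3", "tag-4"])  # fixed ids, to keep the example predictable
created, temp_ids = create_embedded(embedded, new_id=lambda: next(ids))
```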

About relationship graphs: the extension would be required to support graphs in which the only links between the resources in the request document are relationships from the primary data to embedded resources. (That is, the embedded resources don't link to one another, and the primary data's resource objects, if there's more than one, don't link to one another. But embedded resources can link to other, pre-existing resources.) This could be extended to say that any links are valid so long as the resulting graph is acyclic, to support the recursively embedded resources that @tkellen asked about on the original embedded resources proposal, in addition to interlinking between primary data resources, etc.
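
A server could check whether a request stays within that must-be-supported class with something like the following Python sketch. It assumes in-document links are always expressed as "pointer" members and that embedded resources live under /embedded; `pointers_in` and `is_base_graph` are hypothetical names.

```python
def pointers_in(resource):
    """Yield every pointer string used in a resource's relationship linkage."""
    for rel in resource.get("relationships", {}).values():
        data = rel.get("data")
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and "pointer" in item:
                yield item["pointer"]


def is_base_graph(document):
    """True if in-document links only run from primary data to embedded."""
    primary = document["data"]
    primary = primary if isinstance(primary, list) else [primary]
    embedded = document.get("embedded", [])
    # Embedded resources may not point at anything else in the document.
    if any(list(pointers_in(res)) for res in embedded):
        return False
    # Primary resources may only point at embedded resources.
    return all(p.startswith("/embedded/")
               for res in primary for p in pointers_in(res))


flat = {
    "data": [{"type": "articles", "relationships": {
        "tags": {"data": [{"id": "tag-1", "type": "tags"},
                          {"pointer": "/embedded/0"}]}}}],
    "embedded": [{"type": "tags", "attributes": {"name": "three"}}],
}
nested = {
    "data": [{"type": "articles", "relationships": {
        "tags": {"data": [{"pointer": "/embedded/0"}]}}}],
    "embedded": [
        {"type": "tags", "relationships": {
            "category": {"data": {"pointer": "/embedded/1"}}}},
        {"type": "category", "attributes": {"name": "whatever"}},
    ],
}
```

Here `flat` passes the check, while `nested` does not because one embedded resource links to another.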

As far as supporting other graphs goes, the extension would have feature flags (again, mechanism tbd) indicating the types of graphs it supports. The structure of those flags would be defined over time as we collect more real-world use cases. A request might have a way to specify which type of graph it's sending, in order for the server to use more efficient processing.

Analysis

Pros, as I see them:

  • At least for some datastores, the higher-level syntax might be more performant and easier to implement transactionally than the raw "operations" format would be. In particular, I worry with the raw operations format that, unless the server analyzes the operations upfront to look for optimizations, it will simply apply them serially and in order, and that could be slow—especially, god forbid, if we allow some resource types to live across the network, as discussed here. By contrast, with the higher-level syntax, if we're just talking about must-be-supported graphs described above, a fast implementation is very simple: the server creates all the embedded resources at once/in any order and then creates/updates the primary data (at once/in any order) and sets their links to the newly-created embedded resources.

    Or, if we expand the class of must-be-supported graphs to include all acyclic ones, the server treats the primary and embedded resources as one graph and creates them in the order returned by the well-known algorithms for topological sorting. (This also takes the burden of ordering the operations, which could be non-trivial, off the client, so long as it can guarantee that the graph is acyclic; that seems like a very good thing to me. For performance reasons, though, we could always add back the ability for the client to prespecify an order, if it knows a workable one in advance.)

    In the sense that it has a clear processing model, this higher-level syntax is analogous to @dgeb's embedded resources proposal, though it supports a superset of that proposal's graphs and more closely mirrors the "included" feature by using an "embedded" array rather than nesting objects.

  • I think side-posting is easier to learn/use than operations, because the latter requires adopting a different mental model than the one used in the rest of the spec.

  • This syntax seems to cover almost all the cases we've been asked for directly: creating a post with its author, a credit card with its billing address, a conversation with its first message, an invoice with its line items, etc.

  • There's some chance that people would use the operations approach sub-optimally, e.g. by making all their operations to the root endpoint, which would give less-good HTTP caching than the side posting approach, in which the PATCH always goes through the appropriate resource/collection URI and therefore invalidates intermediate caches.
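
The topological-sort processing model mentioned in the first point can be sketched in a few lines of Python using the standard library's `graphlib` module (Python 3.9+). The dependency map here is written by hand and keyed by pointer, which is an assumption made for illustration; a real server would derive it from the relationships in the request.

```python
from graphlib import CycleError, TopologicalSorter


def creation_order(dependencies):
    """Return an order in which every resource follows the ones it links to.

    `dependencies` maps a resource's pointer to the pointers it references.
    Raises CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Two articles share a new tag, and that tag belongs to a new category.
deps = {
    "/data/0": {"/embedded/0"},
    "/data/1": {"/embedded/0"},
    "/embedded/0": {"/embedded/1"},
    "/embedded/1": set(),
}
order = creation_order(deps)
```

The category is created first, then the tag, then the articles, and a cyclic request is rejected with `CycleError` before anything is created.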

The biggest con I see to this higher-level syntax is that, because we'll probably still need an operations extension too, it's duplicative. However, if a good chunk of APIs could get away without implementing operations, and the high level syntax really has the advantages listed above (and doesn't have any unforeseen problems), then having it as an option might make sense anyway. After all, we shouldn't take the "it's duplicative" argument to its extreme, as that would suggest removing most of the base spec, since all one really needs is operations.

Proposal 2: Inline Operations

I think that the ideal request for the second use case would look something like the below:

PATCH /articles/1

{
  "data": {
    "attributes": { 
      "title": "New Title" 
    },
    "relationships": {
      "tags": {
        "operations": [
          { "type": "tags", "id": "tag-3", "op": "add" }, 
          { "type": "tags", "id": "tag-1", "op": "remove" },
          { "pointer": "/embedded/0", "op": "add" }
        ]
      }
    }
  }, 
  "embedded": [{
    "type": "tags",
    "attributes": {
      "name": "three"
    }
  }]
}

This only uses one request, but it feels less RPC-ish than the straight up operations approach, and it builds on the side posting syntax proposed above.

I think of this as the "inline operations" approach because it gets rid of "path" entirely in each operation and thereby tries to blend the operation ideas in with the existing JSON API spec. My hope is that an approach like this could make operations feel a bit more natural, but this is a very new idea, so I'm not sure if it'll actually work. I'm curious what y'all think!
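
To give a feel for the processing involved, here is a minimal Python sketch that applies inline relationship operations to an existing to-many linkage. It assumes the server has already created the embedded resources and knows the id each one received; `apply_inline_operations` and the `embedded_ids` map are hypothetical, and the example data adds the existing tag-2 from the issue's sample data.

```python
def apply_inline_operations(current, operations, embedded_ids):
    """Apply inline add/remove operations to a to-many relationship.

    `embedded_ids` maps a pointer such as "/embedded/0" to the identifier
    the server assigned when it created that embedded resource (assumed).
    """
    linkage = list(current)
    for op in operations:
        if "pointer" in op:
            target = embedded_ids[op["pointer"]]
        else:
            target = {"type": op["type"], "id": op["id"]}
        if op["op"] == "add":
            if target not in linkage:
                linkage.append(target)
        elif op["op"] == "remove":
            linkage = [item for item in linkage if item != target]
        else:
            raise ValueError(f"unknown op {op['op']!r}")
    return linkage


current = [{"type": "tags", "id": "tag-1"}]
operations = [
    {"type": "tags", "id": "tag-2", "op": "add"},
    {"type": "tags", "id": "tag-1", "op": "remove"},
    {"pointer": "/embedded/0", "op": "add"},
]
embedded_ids = {"/embedded/0": {"type": "tags", "id": "tag-3"}}
tags = apply_inline_operations(current, operations, embedded_ids)
```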

@shicholas

shicholas commented Aug 20, 2015

I like this suggestion a lot too. What happens if there are two levels of resources being created? e.g. "Creating two articles and adding new tag and tag category" That is a contrived example, and perhaps the spec should dissuade requests like that?

@tkellen

Member

tkellen commented Aug 20, 2015

The original draft of this post also included this use-case, but we didn't fill it out for time reasons:

  • Create this hierarchical data structure:
.
└── nodeA
    ├── aChild
    └── nodeB
        ├── bChild
        └── nodeC
            └── cChild
  • Transform the above structure to this:
├── nodeA
│   └── aChild
├── nodeB
│   └── bChild
├── nodeC
│   └── cChild
└── rootChild

GET /nodes

This actually starts empty, but the representation below matches this

.
└── nodeA
    ├── aChild
    └── nodeB
        ├── bChild
        └── nodeC
            └── cChild
{
  "data": [{
    "id": "nodeA",
    "type": "nodes",
    "relationships": {
      "parent": {
        "data":null
      },
      "children": {
        "data": [{
          "id": "nodeB",
          "type": "nodes"
        }, {
          "id": "nodeC",
          "type": "nodes"
        }]
      }
    }
  }, {
    "id": "nodeB",
    "type": "nodes",
    "relationships": {
      "parent": {
        "data": {
          "id": "nodeA",
          "type": "nodes"
        }
      },
      "children": {
        "data": []
      }
    }
  }, {
    "id": "nodeC",
    "type": "nodes",
    "relationships": {
      "parent": {
        "data": {
          "id": "nodeA",
          "type": "nodes"
        }
      },
      "children": {
        "data": [{
          "id": "nodeD",
          "type": "nodes"
        }, {
          "id": "nodeE",
          "type": "nodes"
        }]
      }
    }
  }, {
    "id": "nodeD",
    "type": "nodes",
    "relationships": {
      "parent": {
        "data": {
          "id": "nodeC",
          "type": "nodes"
        }
      },
      "children": {
        "data": []
      }
    }
  }, {
    "id": "nodeE",
    "type": "nodes",
    "relationships": {
      "parent": {
        "data": {
          "id": "nodeD",
          "type": "nodes"
        }
      },
      "children": {
        "data": [{
          "id": "nodeF",
          "type": "nodes"
        }]
      }
    }
  }, {
    "id": "nodeF",
    "type": "nodes",
    "relationships": {
      "parent": {
        "data": {
          "id": "nodeE",
          "type": "nodes"
        }
      },
      "children": {
        "data": []
      }
    }
  }]
}

If someone wants to take a stab at representing that in both proposed solutions, that would be great.

@ethanresnick

Member

ethanresnick commented Aug 21, 2015

@shicholas If I understand your question correctly, that would look like this:

POST /articles

{
  "data": [{ 
    "type": "articles", 
    "attributes": { "title": "New Article" },
    "relationships": {
      "tags": {
        "data": [ { "pointer": "/embedded/0" } ]
      }
    }
  }, {
    // second article would be just like the first one
  }],
  "embedded": [{
    "type": "tags", 
    "attributes": {
      "name": "new tag!"
    },
    "relationships": {
      "category": {
        "data": { "pointer": "/embedded/1" }
      }
    }
  }, {
    "type": "category",
    "attributes": { "name": "whatever" }
  }]
}

Per the earlier discussion, though, this example would only be supported if the extension signals it can create any acyclic graph or we make that a base requirement.

@ethanresnick

Member

ethanresnick commented Aug 21, 2015

@tkellen Re the other use cases:

Create this hierarchical data structure:

└── nodeA
    ├── aChild
    └── nodeB
        ├── bChild
        └── nodeC
            └── cChild

That would be done very similarly to my above comment, with a single POST to /nodes.

Transforming that structure, once created, though, would require two requests. The first would move nodeB and nodeC to the root level, pulling their descendants along with them:

PATCH /nodes

{
  "data": [{
    "type": "nodes", 
    "id": "nodeB", 
    "relationships": {
      "parent": { "data": null }
    }
  }, {
    "type": "nodes",
    "id": "nodeC",
    "relationships": {
       "parent": { "data": null }
    }
  }]
}

The second request would create the new rootChild node with a simple POST.

With inline operations, this might be able to be consolidated into one request, like so:

PATCH /nodes

{
  "data": [ // same as above ],
  "operations": [{ 
    "op": "create", 
    "data":  { 
      // new node resource object with parent null 
    }
  }]
}

However, the fact that this complex a transformation takes only one or two requests feels like a lucky anomaly. It's enabled by /nodes happening to return every node (not just the root level ones). Moreover, we happen to be able to get by in this case without specifying the order in which the operations in data (or, in the consolidated version, data and operations) are executed.

For complex transformations like this to work in the general case, though, we'd probably have to specify that the updates in data are executed in order, and that the operations are run after the data changes are applied...which brings us back basically to the same processing model as operations. All of which is to say that there'd still be some cases in which just using operations makes sense, even if we have a higher-level alternative.

@shicholas

shicholas commented Aug 22, 2015

Thank you @ethanresnick for answering my question.

FWIW, I like your proposal better than the operations approach because I feel it adequately addresses any POST use-case I intend on using with minimal changes to the base spec. I like how it keeps the data key, which makes what url I should send the request to clear and makes what I feel the proper response should be (correct me if I'm wrong but I feel it would be the POST request described in the spec for the resource(s) described in the data key).

@ethanresnick

Member

ethanresnick commented Aug 22, 2015

I like how it keeps the data key, which makes what url I should send the request to clear and makes what I feel the proper response should be (correct me if I'm wrong but I feel it would be the POST request described in the spec for the resource(s) described in the data key)

Thanks Nick. I was imagining the response would be the same as the current POST response, except that the newly-created resources from "embedded" would also be returned in the response. This allows the client to see the server-assigned id each was given, as discussed earlier, and any other server-set attributes.

@jimbojetlag

jimbojetlag commented Feb 10, 2017

Still no decision?

@morenoh149

morenoh149 commented Feb 10, 2017

it's a tough cookie

@travisghansen

travisghansen commented Mar 17, 2017

+1 just following along!

@valscion

Contributor

valscion commented Mar 18, 2017

@travisghansen please use GitHub reactions to +1 or the subscribe button to get notifications about updates to this issue. +1 comments usually don't bring additional value to the conversation and cause unnecessary notifications to be sent out to multiple users.

Also great to hear that you, too, want to get this long standing use case resolved later! ☺️

@mockdeep

mockdeep commented May 7, 2017

Sorry if some of what I'm saying has already been discussed here, but I wonder if a couple of simple modifications might help in a number of cases. For example, allowing the 'data' key to be an array seems like it could solve for a lot of use cases, in particular when you want to create a number of the same type of records. Not sure if it would make sense to allow mixing types in that situation, but in most of what I run into I actually just want to use the same verb on the same types of records, e.g.: marking a bunch of tasks completed.

I can see a different use case where you want to batch operations from a performance and latency perspective. To me this actually seems like a higher level sort of resource, where you simply post to a generic batch operations endpoint and let the server sort it out. From the client perspective, it might be a CREATE request to the /batch endpoint where the 'data' contains a nested array of JSONAPI formatted requests, verbs and all. 'data' itself could point to an array, or it could be an object with a nested array to allow for additional meta stuff on the request, such as "data": { "operations": [one, two], "atomic": true, "strict_order": true }. This latter actually doesn't seem like it would require any modification to JSONAPI, aside from maybe documenting it as a recommended approach. It would also allow for a more asynchronous process when the request might take longer to process, where the server responds with a record with a status that represents the set of operations and how far along they are.
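
A rough Python sketch of what building such a batch payload could look like. The member names inside each nested request (`method`, `path`, `body`) and the `build_batch` helper are hypothetical; only the "operations", "atomic" and "strict_order" keys come from the description above.

```python
import json


def build_batch(requests, atomic=True, strict_order=True):
    """Wrap request descriptions in the suggested /batch payload shape."""
    return {
        "data": {
            "operations": list(requests),
            "atomic": atomic,
            "strict_order": strict_order,
        }
    }


def complete_task(task_id):
    """Describe one nested request that marks a task completed."""
    return {
        "method": "PATCH",
        "path": f"/tasks/{task_id}",
        "body": {"data": {"type": "tasks", "id": task_id,
                          "attributes": {"completed": True}}},
    }


payload = build_batch([complete_task("1"), complete_task("2")])
body = json.dumps(payload)  # what the client would send to /batch
```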

@ethanresnick

Member

ethanresnick commented Jun 30, 2017

Fwiw, here's how I solved this within the constraints of JSON:API 1.... basically, I took the solution I proposed upthread (#795 (comment)) and jammed all the changes into the various meta bags. So:

  • there's a document level meta.included key that holds an array of resource objects to create along with the primary resource object. This array can also hold other resources that would have gone in the document's data key were that key allowed to hold an array.

  • there's a resource-level meta.relationships key, which holds objects with json-pointers to the included resources that should be linked with the newly-created ones.

So, creating two articles with related tags looks like this:

POST /articles

{
  "data": { 
    "type": "articles", 
    "attributes": { "title": "First Article" },
    "meta": {
      "relationships": {
        "tags": {
          /* relate it to the tag "new tag!" on creation */
          "data": [ 
            { "pointer": "/meta/included/0" }
          ] 
        }
      }
    }
  },
  "meta": {
    "included": [{
      "type": "tags", 
      "attributes": { "name": "new tag!" },
      "meta": {
        "relationships": { 
          /* relate this tag to the "whatever" tag below */
          "category": {
            "data": { "pointer": "/meta/included/1" }
          }
        }
      }
    }, {
      "type": "category",
      "attributes": { "name": "whatever" }
    }, { 
      /* second article's resource object goes here. 
         it can have relationships to other resources in its payload in the same way  */
    }]
  }
}

Servers still have to decide what linkage graphs to allow (see the 'Analysis' section of the comment I linked above), but that's not too hard and it's sorta inevitable, unless you go for an embedding approach, which trades off reusing a sideposted resource in multiple relationships for a guarantee that the graph is a tree. Implementing such an approach with meta would be easy too, I guess, but I think the approach outlined above strikes the right balance between complexity (it's not full-on operations; you can use it today) and meeting most people's needs.
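As a rough sketch of what "deciding what linkage graphs to allow" could look like on the server, the snippet below resolves the `pointer` members with a small RFC 6901 JSON Pointer resolver and computes an order in which resources can be created so that each one comes after everything it links to. The function names are invented for illustration, and rejecting cyclic linkage is just one possible policy assumed here.

```javascript
// Resolve an RFC 6901 JSON Pointer such as "/meta/included/0" against a document.
function resolvePointer(doc, pointer) {
  if (pointer === "") return doc;
  if (!pointer.startsWith("/")) throw new Error(`invalid pointer: ${pointer}`);
  return pointer
    .slice(1)
    .split("/")
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~"))
    .reduce((node, token) => {
      if (node === null || typeof node !== "object" || !(token in node)) {
        throw new Error(`pointer ${pointer} does not resolve`);
      }
      return node[token];
    }, doc);
}

// List the resources a resource object points at via meta.relationships.
function linkedResources(doc, resource) {
  const relationships = (resource.meta && resource.meta.relationships) || {};
  return Object.values(relationships)
    .flatMap((rel) => [].concat(rel.data)) // to-one and to-many handled alike
    .filter((linkage) => linkage && typeof linkage.pointer === "string")
    .map((linkage) => resolvePointer(doc, linkage.pointer));
}

// Return resources ordered so that dependencies come first.
// Assumed policy: cyclic linkage is rejected outright.
function creationOrder(doc) {
  const order = [];
  const state = new Map(); // resource -> "visiting" | "done"
  const visit = (resource) => {
    if (state.get(resource) === "done") return;
    if (state.get(resource) === "visiting") throw new Error("cyclic linkage");
    state.set(resource, "visiting");
    linkedResources(doc, resource).forEach(visit);
    state.set(resource, "done");
    order.push(resource);
  };
  [doc.data, ...((doc.meta && doc.meta.included) || [])].forEach(visit);
  return order;
}

// Same shape as the request above: article -> tag -> category.
const doc = {
  data: {
    type: "articles",
    attributes: { title: "First Article" },
    meta: { relationships: { tags: { data: [{ pointer: "/meta/included/0" }] } } },
  },
  meta: {
    included: [
      {
        type: "tags",
        attributes: { name: "new tag!" },
        meta: { relationships: { category: { data: { pointer: "/meta/included/1" } } } },
      },
      { type: "category", attributes: { name: "whatever" } },
    ],
  },
};

const order = creationOrder(doc).map((resource) => resource.type);
// order is ["category", "tags", "articles"]
```

A server that prefers the embedding approach could skip the ordering step entirely, since a tree has an obvious creation order.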

Finally, fwiw, if an extensions approach like profile extensions is implemented (#957), it could easily be used to specify that meta.relationships and meta.included are defined according to some extension's spec (identified by url), rather than their being arbitrary, api-specific data. That would look like so (at least based on the syntax in #957):

{
  // map the customly-named keys to extensions with a standardized definition!
  // Note that both names must map to the same extension, as splitting this operation up
  // over multiple extensions would violate some of the constraints on profile extensions
  "aliases": {
    "included": "http://example.com/some-url-for-extension-specifying-sideposting",
    "relationships": "http://example.com/some-url-for-extension-specifying-sideposting"
  },

  // rest of the document is identical to what's shown in the no-extensions example
}

@brainwipe

brainwipe commented Jul 6, 2017

How do you handle errors? Is there a transactional boundary across all the things you are creating?

@ethanresnick

Member

ethanresnick commented Jul 10, 2017

@brainwipe That's a really good question. Currently, how to handle errors produced by values in meta (whether or not those values are linked to a profile extension) is not specified. Obviously, we have to fix that, but I think it's possible to specify the error handling in a way that would allow the server to fail the whole request if any of the extensions it supports contain invalid data/fail. That would give us transaction-like semantics on servers that support the hypothetical extension in my example. That is to say, basically, that I think the scheme I described above can be made to work.

Of course, some servers might not support all the relevant extensions, and those servers would simply ignore the data provided by the extensions they don't understand. This would result in the request succeeding, but with only some of the resources/relationships being created. Since that is probably unacceptable, a client would either have to know in advance whether the server it's working with supports the relevant extensions (in today's world, most clients are hard coded to interact with a single server, so this knowledge would be easy to build in) or the client could use content negotiation to ask the server in advance whether it supports the relevant extensions.
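A minimal sketch of the client-side guard this implies: before sending, compare the extensions a document relies on (read from the `aliases` member used in the example above) against the extensions the server is known to support. How the client obtains that list (hard-coded, or via some negotiation step) is not specified anywhere, so `supported` is simply passed in here, and all function names are hypothetical.

```javascript
const SIDEPOSTING = "http://example.com/some-url-for-extension-specifying-sideposting";

// Collect the distinct extension URLs a document depends on.
function requiredExtensions(doc) {
  return [...new Set(Object.values(doc.aliases || {}))];
}

// Return the extension URLs the document needs but the server lacks.
function missingExtensions(doc, supported) {
  const known = new Set(supported);
  return requiredExtensions(doc).filter((url) => !known.has(url));
}

// Refuse to send a document the server would only partially understand,
// since silently ignored extensions would lead to a partial create.
function assertSendable(doc, supported) {
  const missing = missingExtensions(doc, supported);
  if (missing.length > 0) {
    throw new Error(`server does not support: ${missing.join(", ")}`);
  }
  return doc;
}

const request = {
  aliases: { included: SIDEPOSTING, relationships: SIDEPOSTING },
  data: { type: "articles", attributes: { title: "First Article" } },
};
```

With this in place, `assertSendable(request, [SIDEPOSTING])` returns the document unchanged, while `assertSendable(request, [])` throws instead of risking a request that succeeds with only some of the resources created.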

@remmeier

Contributor

remmeier commented Jul 13, 2017

In crnk.io and ngrx-json-api we have now implemented multiple-operation support with jsonpatch. So far that works very well. The implementation was quite simple, and there was no need for a new standard. Error handling works well because each request is able to provide its own response (including its status code). It can also handle more complex object structures, and the objects in question do not necessarily need to be related. The one thing still missing, and which would need some specification, is how to deal with relationships and server-created IDs when doing POSTs (but usually we do not make use of that in the first place).

So from my point of view, while it may seem easier to attach objects somewhere within the relationships or meta data, the jsonpatch/operation approach stays closer to the semantics of jsonapi, making it easier to use and implement with the full feature set of json api (IMHO...).

(not to forget that the approach would also allow the invocation of non-jsonapi services)
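For readers unfamiliar with the idea, here is a small sketch of a JSON Patch style operations list where every operation gets its own response and status code. This is not the crnk.io or ngrx-json-api implementation; treating `path` as a resource URL, the in-memory `Map` store, and the use of client-generated ids (which sidesteps the open question about server-created IDs) are all assumptions for illustration.

```javascript
// Apply one { op, path, value } entry against an in-memory store (Map of Maps)
// and return a response object with its own status code.
function applyOperation(store, { op, path, value }) {
  const [, type, id] = path.split("/"); // "/articles/a1" -> type "articles", id "a1"
  const collection = store.get(type);
  if (!collection) {
    return { status: 404, errors: [{ detail: `no such collection: ${type}` }] };
  }
  switch (op) {
    case "add":
      if (!value || !value.id) {
        return { status: 400, errors: [{ detail: "a client-generated id is required" }] };
      }
      if (collection.has(value.id)) {
        return { status: 409, errors: [{ detail: `${type}/${value.id} already exists` }] };
      }
      collection.set(value.id, value);
      return { status: 201, data: value };
    case "replace": {
      const existing = collection.get(id);
      if (!existing) return { status: 404, errors: [{ detail: `${path} not found` }] };
      const updated = {
        ...existing,
        attributes: { ...existing.attributes, ...(value && value.attributes) },
      };
      collection.set(id, updated);
      return { status: 200, data: updated };
    }
    case "remove":
      if (!collection.delete(id)) {
        return { status: 404, errors: [{ detail: `${path} not found` }] };
      }
      return { status: 204 };
    default:
      return { status: 400, errors: [{ detail: `unsupported op: ${op}` }] };
  }
}

// One response per operation, in request order.
const applyOperations = (store, operations) =>
  operations.map((operation) => applyOperation(store, operation));

const store = new Map([["articles", new Map()], ["tags", new Map()]]);
const results = applyOperations(store, [
  { op: "add", path: "/articles", value: { type: "articles", id: "a1", attributes: { title: "First" } } },
  { op: "replace", path: "/articles/a1", value: { attributes: { title: "Renamed" } } },
  { op: "remove", path: "/tags/missing" },
]);
// results.map((r) => r.status) is [201, 200, 404]
```

Note that this sketch applies operations independently, so a failure in one does not undo the others; whether the whole list should be transactional is a separate decision.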

@brainwipe

brainwipe commented Jul 14, 2017

Just so that I understand...

Obviously, we have to fix that, but I think it's possible to specify the error handling in a way that would allow the server to fail the whole request if any of the extensions it supports contain invalid data/fail.

So we're saying that the single POST/PUT/DELETE is a transaction? One fails, they all fail?

As for server support, I completely agree that most clients are coded against a single server. For those that rely on discovery, you usually have a "discovery document" at the root of the API (in json:api, that would be a document with meta info only), and that document would be the perfect place to list the extensions.

As an aside, don't dwell on this bit below...
I still remain unconvinced that putting multiple operations in a single POST is a good idea. HTTP/REST, and HTTP/2 even more so, are designed around the idea of many small, atomic operations. By adding more than one operation in a POST you are breaking that atomicity.

From a domain design point of view, you're giving the client the ability to understand more about the inner workings of the domain, and by doing so you're tightly coupling the client to the domain. Once you tightly couple the domain (not the API, the domain objects themselves) to the client, it becomes increasingly difficult to change the API. There has to be some coupling, but until now json:api has been about resources and the links between them in a generic fashion; by adding in composites you're now saying that the client needs to understand what can be built with what and in what order.

I'm just adding this in here for posterity because I can see some domains being leaked through the multi-operations and the API builder should be aware of that.

@kellysutton

kellysutton commented Jul 14, 2017

Just chiming in here, in the spirit of mapping json:api to HTTP semantics, I believe the transaction boundary will always need to be at the request level. Put another way: if one POST/PUT/DELETE action fails, they should all fail. (There is no standard "Partial Success" status code.)
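The "one fails, they all fail" rule can be sketched as follows: run every operation against a working copy and only swap it in when all of them succeed. A real server would more likely lean on a database transaction; the `runAtomically` name, the operations-as-functions shape, and the JSON deep copy (which assumes plain JSON-serializable state) are assumptions for illustration.

```javascript
// Runs every operation against a copy of the state. The caller's state is
// only replaced by the copy when all operations succeed.
function runAtomically(state, operations) {
  const draft = JSON.parse(JSON.stringify(state)); // deep copy of plain JSON data
  for (const [index, operation] of operations.entries()) {
    try {
      operation(draft);
    } catch (error) {
      // Report which operation failed and hand back the untouched state.
      return {
        committed: false,
        state,
        errors: [{ source: { pointer: `/operations/${index}` }, detail: error.message }],
      };
    }
  }
  return { committed: true, state: draft, errors: [] };
}

const before = { articles: { a1: { title: "old" } }, tags: {} };

// The second operation fails, so the title change from the first is discarded.
const failed = runAtomically(before, [
  (draft) => { draft.articles.a1.title = "new"; },
  () => { throw new Error("tag name is required"); },
]);

// With only valid operations the new state is returned.
const succeeded = runAtomically(before, [
  (draft) => { draft.articles.a1.title = "new"; },
  (draft) => { draft.tags.t1 = { name: "whatever" }; },
]);
```

Because the request either commits fully or not at all, the response can use a single ordinary status code, which fits the point about there being no standard "Partial Success" status.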

I have worked with companies and clients who don't like the idea of maintaining the transaction boundary on the client side, i.e. having the client issue individual requests and then handling rollback in case one fails.

In situations like that, we usually go with jsonpatch in the meta.

@lolmaus

lolmaus commented Jul 14, 2017

I have worked with companies and clients who don't like the idea of maintaining the transaction boundary on the client side, i.e. having the client issue individual requests and then handling rollback in case one fails.

Yeah, what if a rollback fails too? :trollface:

@kellysutton

kellysutton commented Jul 14, 2017

There are many failure scenarios, that being one of them. Another: What if the client loses a path to the server right before it issues a rollback command?

@mltsy

mltsy commented Nov 8, 2017

I'm also curious about @brainwipe's question:

Does anyone have a real world example of something they need to do that requires this?

I came here because I wanted to be able to update a whole list of domain_names in a single request, and that was for optimization. Thinking about your comment on that, I'm considering whether it would make more sense to solve that on the server side. The problem is, I don't know how to solve it on the server side without replacing my whole framework or increasing server capacity (both of which increase costs). So my possible alternative is to create a non-JSON:API endpoint that allows me to update all of them at once - which isn't a terrible alternative for those of us who just need to optimize some specific use-case... in the interest of keeping JSON:API resourceful. This is similar to optimizing database access from a Rails app. Most of the time you can get by just using ActiveRecord... sometimes though, when you need to sort by the sum of an associated collection's fields... you might just have to use another way of accessing the database if you want it to go fast.

I feel like there are probably other use cases here aside from optimization though. Maybe all of them are more for convenience/simplicity than necessity - but those are important too. (The question is, what are they? And should they be addressed by JSON:API?)

@richmolj

Contributor

richmolj commented Nov 9, 2017

@mltsy I think you've identified the central question:

I feel like there are probably other use cases here aside from optimization though. Maybe all of them are more for convenience/simplicity than necessity - but those are important too. (The question is, what are they? And should they be addressed by JSON:API?)

Here is the sample application from my jsonapi_suite tutorial:

This is a fairly simple application but illustrates a variety of use cases. The form on the right illustrates the need for sideposting. We want to submit the Employee, Positions, and Department all in one request - this way things run in a transaction, and if the transaction fails validation errors are returned.

(one final requirement not shown is the ability to disassociate an entity versus destroying it - think removing tags from the employee...we'd want to remove the tag but not destroy it)

The sideposting alternatives I've heard either A) have the potential to leave orphaned records in the DB or B) require the programmer to create a custom aggregate. I dislike both of these options because sideposting can be encapsulated by libraries (in my case jsorm). This means developer-facing code is simple and out-of-the-box:

// save employee, create a position and associate to existing department
Department.find(1).then(({ data }) => {
  let position = new Position({ title: 'Safety Inspector', department: data });
  employee.save({ with: { positions: 'department' } });
});

See here for the full app.

These common use cases are what I'm looking for - everything else is just gravy. FWIW, for your use case I have a dummy entry endpoint and sidepost domain_names to it.

@e0ipso

e0ipso commented Nov 9, 2017

This is a solution we are using for this: https://github.com/e0ipso/subrequests (nodejs) and https://www.drupal.org/project/subrequests (Drupal).

This is an article explaining how it works: https://www.lullabot.com/articles/incredible-decoupled-performance-with-subrequests

I hope it helps.

@mltsy

mltsy commented Nov 9, 2017

So @richmolj - to compare your use case there against @brainwipe's argument that this allows/encourages domain knowledge to leak into the client: I think it's clear that some domain knowledge-rules are managed in the client here, but only ones that the client cares about, right? In this kind of scenario, the client is not simply a UI for the API, it's an application built on top of a given data source, provided by the API. So while the API has its own domain model and rules, the client application itself may also have some of its own domain modeling and rules that it would like to enforce (in this case, that employees aren't created without positions attached - so the app can always assume that an employee has positions, and therefore not provide an interface for the case of an employee that doesn't have a position).

In this context, the API is playing the role of a data source and it's true that a data-source doesn't have to provide features like transactions or multiple simultaneous updates/actions. Lack of those features can be worked around by an application (an interface, or logic can be implemented to handle cases of orphaned records). But they are nice features for a data source to have, which make the [client] application logic and interface simpler and more straight-forward.

Whether the scope/intent of JSON:API has room for these features is still debatable, but from a practical and conceptual perspective I think this shows that they would be desirable.

@ethanresnick

Member

ethanresnick commented Nov 10, 2017

There has been so much evidence (in this thread and others) that sideposting is a good idea and a needed feature. @dgeb and I have agreed that it will be coming to JSON:API with 1.1, independent of the broader operations proposal. The only blocker now is some details about how it should work. Those are being discussed in #1197. That should be the canonical issue for now if you want to contribute to the discussion around sideposting. We'll also have an issue, I imagine, for the discussion around operations when @dgeb gets out a new draft of that. For now, though, I'm going to lock this issue.

@json-api json-api locked and limited conversation to collaborators Nov 10, 2017
