Skip to content
This repository has been archived by the owner on Nov 23, 2021. It is now read-only.

define and implement Metadata protocol between client and server #217

Closed
MarcoRose opened this issue May 7, 2015 · 21 comments
Closed

define and implement Metadata protocol between client and server #217

MarcoRose opened this issue May 7, 2015 · 21 comments
Milestone

Comments

@MarcoRose
Copy link
Member

preface:

  • Please define and discuss scope first - Most of the below mentioned metadata is already implemented.
  • Each metadata should be defined in the context of the component that uses this metadata (e.g. page is used by search-component, uuid is handled by logger)
    • documentation as important as implementation here

cite from a yammer post by Angel: "define a clear protocol between the client and server for Devon

With protocol I mean defining some fields that every request/response should contain. This will be built on top of OASP guidance (or could be incorporated to the open source project). The aim is to make Devon a "turn-key" solution where some decisions are already taken. Think on pagination for example, or error messages.

On the spanish version of Devon we use an structure like this for the request.

{
“bodata” : { “name”: “xxx”, “city” : “valencia” },
“bopage” : { “limit” : 10, “page” : 2}
}

on the response there are fields like "errors" which are objects with properties code/message for each error on the server.

Having a protocol opens for new possibilities to add value on top of Devon. For example:

timestamp
And on a project we added a “timestamp” key for evaluating the time and performance from the client side. This is very useful on mobile apps to actually show the time spent on the wire.

validation errors
Ability to associate validation errors (on individual fields / whole submission)

server side data
On another project we used meta-data for sending data “from the server” re-using one request from the client. That way you don’t need a new technology for downstream information. It was used for notifications, but of course it is not real time (the messages were stored on a queue waiting for a request/response to be delivered)

uuid
For a customer we are developing a solution where we will send an uuid from each request coming from the client, and that uuid will be show on each log and sent on each WS it calls. That way we will have complete traceability for the user actions.

security
Force the usage of a top-level JSON object, instead of being able to return a top-level array, to prevent CSRF attacks on GET-endpoints"

@amarinso
Copy link
Member

amarinso commented May 7, 2015

As already stated by @hohwille there is already something implemented both for defining errors on server side and using a CorrelationID (the uuid on the proposal) on the headers of the request.

We can also add a timestamp on another header, or make the timestamp part of the CorrelationID identifier. Also the name of the original application could be useful if this CorrelationID travels across applications (i.e through WS). Sample:

[app]-[date]-[random]

sampleapp-1430987859175-abcdefghijk...

My only concern with using headers for communication is that it's a problem if using JSONP as we can't get the headers of the response on client side.

So for example, server side data cound't be possible with JSONP if using the headers. But again maybe the correct and recommended way to deal with server side push communication is by using websockets. If is this the case we should also define a protocol within websockets becouse as it is, it is quite low level.

Also there is still no broad support for websockets on all application servers, I don't know if we should provide a mechanism for fallback.

Validation errors on the response may be used to identify on which UI control is the error and mark it as invalid. Usually this validation can be done client-side but there are times when this has to be validated on server-side. Think about the very usual case where you are filling a registration form and you fill in the "username" field:

  • A request is made to the server
  • Server validates each field and responds with several errors for each field
  • on client side the error can be redirected to the input field.
-- request --> POST  /form

{ 
  "username" : "angel",
  "password" : "secret"
}

<-- response --

{
   "errors" : {
      "username" : [
            "The name is already in use",
      ],
       "password" : [
             "Password is too short",
             "Password must contain numbers"
      ]
   }
}

For security please @henning-cg elaborate on the issues with responding with a JSON array.

As marco said documentation is here as valuable or more than implementation so I propose that we add a chapter defining this protocol for OASP.

To recapitulate:

  • Should we use headers for metadata?
  • Should we use websockets for push?
    • if so, should we define what's travelling inside the wire?
  • Should we define a general format for server-side errors?

@hohwille
Copy link
Member

hohwille commented May 7, 2015

We can also add a timestamp on another header, or make the timestamp part of the CorrelationID identifier.

On the server side the CorrelationID is just a string that is also added to MDC and therefore logged with every related log message. If the client/requester does not provide a CorrelationID a UUID is generated for the request.

@hohwille
Copy link
Member

hohwille commented May 7, 2015

For oasp4js we wanted to generate a uid on client startup valid for the session. Then we want to build correlation ID by added a simple request counter to that uid. So you have uid-req. In the logfiles (using logstash and elasticsearch) you can then correlate all log entries either by the exact correlation ID or e.g. just by uid and then see all log events send by one client session. Still you do not know what user it is except there is someone logging the user name (what is prohibited by our logging conventions). Just to note for privacy.

@hohwille
Copy link
Member

hohwille commented May 7, 2015

Validation errors on the response may be used to identify on which UI control is the error and mark it as invalid.

We are using javax.validation and the exception facade is transforming the entire constraint violations to JSON. So if you are providing the correct IDs in you validation annotations you could map that to the UI. Still I am quite concerned about too tight binding between server-side logic and client-logic. You can still have some mapping logic for that on the client side as well. OASP should be not in the way here.

@hohwille
Copy link
Member

hohwille commented May 7, 2015

If you observe some detailed issue where the behavior in our sample app is not as you expect, then let us talk about that in detail.

@MarcoRose
Copy link
Member Author

great, a lot of ideas..

Let's find a assignee for the task, who can go into detail and work on the solution. Angel, do you have somebody in mind?

@henning-cg
Copy link
Contributor

The security aspect with the format of the json response I was referring to is with respect to GET requests to the API which return a JSON array as the top-level object (see for example https://www.owasp.org/index.php/OWASP_AJAX_Security_Guidelines under the title "Always return JSON with an Object on the outside").

The recommendation is to always return a JSON object as the top-level object.

@amarinso
Copy link
Member

amarinso commented May 8, 2015

So to follow the OWASP recomendation we should wrap our responses to be always an object.

Maybe we can have an interceptor and convert the response to the format:

{ "result" : <result> }

But if used this way I don't know if we loose the ability to document the API with tools like swagger if they only introspect the Rest classes.

Alternatively a base generic WebDTO should be derived for the responses on the Rest controller.

Or we skip this OWASP recommendation consciously, but that way we are cheating ourselves by recommending it on other situations.

@hohwille
Copy link
Member

Thanks for pointing this out. I did not yet understand how the exploit is going to work? Can someone explain the actual vulnerability? In most cases we are already using explicit transfer objects. Only for simplicity in queries we simply return the list of hits. I was already suggesting to use a dedicated object that could also contain additional info such as a flag if the hits are complete or if there are additional hits available for paging. However, for simplicity so far I was overruled to just return a List of the hits.
Seems we should revisit this...

@henning-cg
Copy link
Contributor

Yes, I would recommend to revisit this, as this vulnerability was used, afaik, succesfully against Gmail some years ago, for example.

The problem lies in the fact that the only JSON structure which is also valid Javascript is if the JSON's top level object is an Array.

To attack an endpoint which returns such a JSON structure, it has to be reachable via GET. You then do the following steps:

  • Override the Array constructor
  • Load the GET endpoint via a simple SCRIPT tag
  • As the JSON returned is valid Javascript, what happens is that your overridden Array constructor is called
  • In this overridden constructor, you save the array to a place reachable by other Javascript code of you

There are several ways to mitigate this. Some just insert manually a string before the JSON response, which would explode if parsing the response like Javascript. You then have to remove this string manually before parsing it as JSON.

But the simplest way is simply to forbid JSON arrays as top-level objects.

@MarcoRose
Copy link
Member Author

While there already is a fruitful discussion on metadata I wanted to suggest a general idea how to document.

Please create a crosscut-chapter Metadata (maybe in https://github.com/oasp-forge/oasp4j-wiki/wiki/guide-service-layer) showing a table of all metadata that is (or will be) provided by oasp. List each metadata-attribute, provide a very brief (1 line) explanation what it is for and link it to a respective chapter in the documentation, where the component using the actual metadata is described in depth. E.g. bopage is used by a search-component in the logic layer.

If there is no in-depth component description to add metadata-usage, create it (or post list of missing components here).

@henning-cg
Copy link
Contributor

I started to add a metadata section to our service layer guide. For now I only included correlation id, validation and pagination, as only the first one is currently implemented completely, and the other two are worked on.

The security aspect (only returning objects) might be solved once we have full pagination support baked in, as then all operations returning a list will probably use some kind of base pagination TO.

@henning-cg
Copy link
Contributor

In order to have a foundation for further discussion, I would like to propose the following definition for a basic protocol for rest communication between client and server:

Request

  • Body is always JSON (no surprise here)
  • We do not define a special structure for normal entity creation or updates

Requests including pagination / searching

{
    // Pagination params (if requesting pagination)
    pagination: {
        page: 2,
        pageSize: 25
    },
    // Additional search params, no special format defined
    name: "a part of the name",
    state: "some kind of state"
}

Response

Successful responses (HTTP 2xx)

List of entities

{
    result: [array_of_entities],

    pagination: {
        page: 2,
        pageSize: 25, // actual page size is returned, in case the server did not accept the page size proposed by the client
        totalEntries: 2371
    }
}

Single entity

{
    result: the_actual_result_of_the_endpoint
}

The reason the single entity is also wrapped inside a result attribute is two-fold:

  1. Have the same base structure as in lists
  2. Have room for extending with other metadata in the future

There are of course downsides to this:

  1. Right now we do not need it
  2. If using the same ETO for a SOAP service, the message sent gets more complicated, too

Errors (HTTP 4xx, 5xx)

General error:

{
    message: "A readable error message",
    code: "error code",
    uuid: "some kind of identifying (correlating) id"
}

Validation error:

{
    message: "A readable error message",
    code: "error code",
    uuid: "some kind of identifying (correlating) id",
    errors: {
        "fieldname1": [
            "message1",
            "message2"
        ],
        "fieldname2": [
            "message3"
        ]
    }
}

@henning-cg
Copy link
Contributor

@amarinso @hohwille @MarcoRose Any input on the proposal? 😄

@amarinso
Copy link
Member

I think if pageSize is nested on pagination property maybe it can be shortened to simply size

Also totalEntries could just be total

We should agree on this to unlock the tasks and also it has direct impact on the sample application that we have to properly adapt to the new protocol.

@hohwille
Copy link
Member

I see an issue with totalEntries or total in general. This means you always have to run two queries - the regular query and a count query. That is a waste of performance that I dislike. If that info is required it should only be requested by the client upfront and only once and not for each page switch. IMHO we should only provide if the current page is complete or not (what can be easily provided without reasonable performance overhead).
Please note that modern solutions switch from paging to lazy loading/infinitve scolling.

@hohwille
Copy link
Member

Here an example for infinite scrolling:
http://ui-grid.info/docs/#/tutorial/212_infinite_scroll

Also e.g. web-de or google do that for their mail clients...

@hohwille
Copy link
Member

Our "protocol" should support both use-cases: paged view and infinitve scrolling.
Your suggestion is fine. I only would remove the total and make that an extra and optional call.

@amarinso
Copy link
Member

Maybe if the request includes the totalproperty then the server has to respond with the total thus only clients interested on synchronously wait and get this info.

@amarinso
Copy link
Member

amarinso commented Jul 1, 2015

All this has been addressed on release 1.3.0. We can close this issue

@amarinso amarinso closed this as completed Jul 1, 2015
@hohwille hohwille modified the milestones: oasp:1.3.0, backlog Jul 2, 2015
@hohwille
Copy link
Member

hohwille commented Apr 22, 2016

Maybe we should revisit this one. There seems to be an existing open approach on the web that has already a wide acceptance and based on this supports some tooling, etc.:
https://prometheus.io/docs/querying/api/
WDYT? Shall we reopen?
Also or even more this somehow affects our approach with the "correlationId".

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants