DataSource Chaining (Master Ticket) #615

Closed
ThePrimeagen opened this issue Nov 12, 2015 · 1 comment


@ThePrimeagen
Contributor

Spec

Currently, a Router (a DataSource) creates an atom whenever a requested path does not match any route. Unfortunately, this prevents the caller from distinguishing between two cases: the DataSource handled the path request and found the value to be undefined, or the DataSource was not equipped to handle the path request at all.

This is suboptimal for several reasons:

  • It may indicate an error if a client attempts to request a route that does not exist on the server. For example it may be that the client was simply deployed against the wrong version of the server. Obscuring the distinction between an undefined value and a path with no matching route makes it impossible to detect this case.
    • It should be noted here that the changes to the model (specified below), where the model materializes unhandledPaths, will effectively cause the same problem so this argument is moot from the perspective of a model.
  • It bloats message sizes. If a large path set does not match any route, atoms will be generated for every leaf in the path set. This information could be communicated over the wire much more densely.
  • It makes it impossible to chain DataSources (like Router) together. JavaScript objects work using a prototype system. If an unsuccessful attempt is made to retrieve a member from an object, the same lookup is repeated on the object's prototype. If Routers report missing paths, it becomes possible to mimic the same prototype walk for DataSources. This is known as DataSource chaining.
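
For reference, the prototype walk that DataSource chaining is meant to mimic looks like this in plain JavaScript. The object names (primary, fallback) are illustrative only and do not come from Falcor.

```javascript
// Plain JavaScript prototype lookup, the behaviour chaining mimics.
const fallback = { baz: 'from fallback' };
const primary = Object.create(fallback); // fallback is primary's prototype
primary.bar = 'from primary';

const lookups = {
    bar: primary.bar, // found on primary itself
    baz: primary.baz, // missing on primary, found on its prototype
    qux: primary.qux  // missing everywhere, so undefined
};
```

In the chaining analogy, primary plays the role of the first DataSource and fallback the next one in the list.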

Proposed Changes

JSONGraphEnvelope will require a new key to convey that there are unhandled paths. The key will be unhandledPaths.

E.g., example output from a Router that only has the route 'foo.bar' defined:

router.get([['foo', ['bar', 'baz']]]);

{
    jsonGraph: {
        foo: {
            bar: 'hello world!'
        }
    },
    paths: [
        ['foo', 'bar']
    ],
    unhandledPaths: [
        ['foo', 'baz']
    ]
}

The proposed changes will create 3 to 4 separate pieces of work. The order of the list does not dictate the order in which the work needs to be done.

Router Changes

The router potentially encounters two types of missing paths during its life cycle: paths for which no route is specified, and paths for which the matched route did not fulfill everything that was requested. The latter case is out of scope for this change, since such a path was not unhandled, just mishandled. The router will have to note every path that matched no route and report it in the unhandledPaths member of the JSONGraph envelope.
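
A minimal sketch of that bookkeeping, assuming a route table that is just a list of simple paths. The helpers expandPathSet and partitionPaths are hypothetical names for illustration, not Router APIs, and ranges are not supported.

```javascript
// Expand a path set such as ['foo', ['bar', 'baz']] into simple paths.
// Only plain keys and arrays of keys are supported here (no ranges).
function expandPathSet(pathSet) {
    return pathSet.reduce(function(prefixes, keySet) {
        const keys = Array.isArray(keySet) ? keySet : [keySet];
        const next = [];
        prefixes.forEach(function(prefix) {
            keys.forEach(function(key) {
                next.push(prefix.concat([key]));
            });
        });
        return next;
    }, [[]]);
}

// Hypothetical helper: split requested paths by whether any route matches.
// `routes` is a list of simple paths standing in for a real route table.
function partitionPaths(pathSets, routes) {
    const routeKeys = new Set(routes.map(function(route) {
        return JSON.stringify(route);
    }));
    const result = { matched: [], unhandled: [] };
    pathSets.forEach(function(pathSet) {
        expandPathSet(pathSet).forEach(function(path) {
            const matched = routeKeys.has(JSON.stringify(path));
            result[matched ? 'matched' : 'unhandled'].push(path);
        });
    });
    return result;
}

// Mirrors the 'foo.bar' example: only ['foo', 'bar'] has a route.
const partitioned = partitionPaths([['foo', ['bar', 'baz']]], [['foo', 'bar']]);
// partitioned.matched   -> [['foo', 'bar']]
// partitioned.unhandled -> [['foo', 'baz']]
```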

Paths that are matched but produce no output will no longer be materialized as they were in the former implementation. It is the responsibility of the route to produce all paths that it matches. It is odd that the router produced those paths to begin with. Let's look at an example of how this used to work and how it should work.

E.g. (imagine videos[999] did not exist):

router.get([['videos', [123, 999], ['title', 'rating', 'description']]]);

Under the old behavior, the route implementor could leave 999 out of the route's output and the routing mechanism would automagically fill it in.

// Old Way
{
    jsonGraph: {
        videos: {
            123: {
                title: 'Running Man',
                rating: 5.0,
                description: 'A wrongly convicted man must try to survive a public execution gauntlet staged as a game show.'
            },
            999: {
                title: {$type: 'atom'},
                rating: {$type: 'atom'},
                description: {$type: 'atom'}
            }
        }
    },
    paths: [
        ['videos', [123, 999], ['title', 'rating', 'description']]
    ]
}

// New Way
{
    jsonGraph: {
        videos: {
            123: {
                title: 'Running Man',
                rating: 5.0, // out of 4
                description: 'A wrongly convicted man must try to survive a public execution gauntlet staged as a game show.'
            },
            999: {$type: 'atom'} // from the route
        }
    },
    paths: [
        ['videos', [123, 999], ['title', 'rating', 'description']]
    ]
}

Requiring the route to do the due diligence of producing its entire output makes the output itself more valid. The router can only guess where the undefined values are. The route implementor knows exactly where they are, and so will always produce output at least as correct as the router's.

Handling unhandledPaths from a matched route
If a route wishes to specify the set of unhandledPaths, it can do so through two conventions:

// From PathUnhandledValue
{
    path: [...],
    unhandled: true
}

// From JSONGraph
{
    jsonGraph: {
        ...
    },
    paths: [  ],
    unhandledPaths: [  ]
}

This gives the route the option to intentionally not handle a path and allow a source further down to handle it.
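
As a rough illustration of the first convention, the sketch below folds a route's output into an envelope, sending items flagged unhandled: true to unhandledPaths. The toEnvelope helper and the { path, value } item shape used for handled values are assumptions made for this example, not the Router's actual internals.

```javascript
// Hypothetical helper: fold route output into a JSONGraph envelope.
// Each item is either { path, value } or { path, unhandled: true }.
function toEnvelope(routeOutput) {
    const envelope = { jsonGraph: {}, paths: [], unhandledPaths: [] };
    routeOutput.forEach(function(item) {
        if (item.unhandled === true) {
            // The route declined this path; leave it for a later source.
            envelope.unhandledPaths.push(item.path);
            return;
        }
        // Walk (or create) the branches, then write the value at the leaf.
        let node = envelope.jsonGraph;
        item.path.slice(0, -1).forEach(function(key) {
            node[key] = node[key] || {};
            node = node[key];
        });
        node[item.path[item.path.length - 1]] = item.value;
        envelope.paths.push(item.path);
    });
    return envelope;
}

const routeEnvelope = toEnvelope([
    { path: ['foo', 'bar'], value: 'hello world!' },
    { path: ['foo', 'baz'], unhandled: true }
]);
```

With this input, routeEnvelope has the same shape as the 'foo.bar' example output near the top of this spec.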

This will incur a major revision change, since the change is not additive (though it appears to be). Currently, a data source is expected to fulfill all requested paths, which is why a Router fills every path where no value is found with a materialized atom. Adding unhandledPaths is not additive; it fundamentally changes the contract. Even chained dataSources could return unhandledPaths. Without the necessary changes, a Model will hit the "MaxRetryError" under the proposed contract (fix is change 3, under Model).

A New Library (falcor-datasource-chainer)

The falcor-datasource-chainer implements the DataSource interface. Its constructor takes a list of DataSources. Each source will be checked in array order (this is literally the request queue interview question). The datasource-chainer also has custom merge logic so that it can merge the JSONGraph envelopes that arrive sequentially and optimize unhandledPaths. The datasource-chainer will more than likely borrow logic from the router (potentially moving it to a common library).

Get

A get response is considered complete when any of the following three conditions is met:

  • The unhandledPaths key is empty or non-existent.
  • The chain of dataSources has been exhausted.
  • There was an onError from the dataSource. We no longer have the context for chaining. An onError from a dataSource is a catastrophic error, not an error within the dataSource's jsonGraphEnvelope (imagine a 500 from the HttpDataSource).
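
The three conditions can be sketched as a loop. This is a simplification under stated assumptions: each source exposes a synchronous get(paths) that returns an envelope or throws, whereas a real DataSource is asynchronous and reports failures through onError. The names chainedGet and mergeGraph are hypothetical, and the sketch swallows the error for brevity, which a real implementation would probably need to surface.

```javascript
// Recursive merge of one jsonGraph into another (no reference handling).
function mergeGraph(target, source) {
    Object.keys(source).forEach(function(key) {
        const value = source[key];
        const isBranch = value !== null && typeof value === 'object' && !value.$type;
        if (!isBranch) {
            target[key] = value;
            return;
        }
        const existing = target[key];
        const reusable = existing && typeof existing === 'object' && !existing.$type;
        target[key] = mergeGraph(reusable ? existing : {}, value);
    });
    return target;
}

// Hypothetical chained get over sources with a synchronous get(paths).
function chainedGet(sources, paths) {
    const result = { jsonGraph: {}, paths: [], unhandledPaths: paths };
    for (let i = 0; i < sources.length; i++) {
        // Condition 1: nothing is left unhandled.
        if (result.unhandledPaths.length === 0) { break; }
        let envelope;
        try {
            envelope = sources[i].get(result.unhandledPaths);
        } catch (e) {
            // Condition 3: a catastrophic error ends the chain.
            break;
        }
        mergeGraph(result.jsonGraph, envelope.jsonGraph || {});
        result.paths = result.paths.concat(envelope.paths || []);
        result.unhandledPaths = envelope.unhandledPaths || [];
    }
    // Condition 2: the loop also ends once the sources are exhausted.
    return result;
}

// Usage with two stub sources; the second one handles whatever it is given.
const firstSource = { get: function() {
    return {
        jsonGraph: { foo: { bar: 'hello world!' } },
        paths: [['foo', 'bar']],
        unhandledPaths: [['foo', 'baz']]
    };
} };
const secondSource = { get: function(requested) {
    return { jsonGraph: { foo: { baz: 42 } }, paths: requested };
} };
const chained = chainedGet([firstSource, secondSource], [['foo', ['bar', 'baz']]]);
```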

Set

Set does not make much sense to chain. We will only perform the set operation on sources[0].

Call

For call, the equivalent of unhandledPaths is the "function does not exist" error. If the dataSource is a router and the router does not have a route for the specific call function, the router will throw an error. If a dataSource errors with that error, the next source in the chain will be called, and so on until a JSONGraph envelope is returned or the list of DataSources has been exhausted. If the list is exhausted, the "function does not exist" error should be forwarded through the onError channel.
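
A sketch of that fallthrough, again using synchronous stand-ins for DataSources. The error is matched here by a message string (FUNCTION_NOT_FOUND), which is an assumption: this spec does not say how the error is identified. chainedCall is a hypothetical name.

```javascript
// Hypothetical marker for the "function does not exist" error.
const FUNCTION_NOT_FOUND = 'function does not exist';

// Try each source in order; only a missing function moves down the chain.
function chainedCall(sources, callPath, args) {
    for (let i = 0; i < sources.length; i++) {
        try {
            return sources[i].call(callPath, args);
        } catch (e) {
            // Any other error is catastrophic: stop chaining and forward it.
            if (e.message !== FUNCTION_NOT_FOUND) { throw e; }
        }
    }
    // Every source lacked the function, so forward the same error.
    throw new Error(FUNCTION_NOT_FOUND);
}

// Usage with stub sources: the first has no such function, the second does.
const noSuchFunction = { call: function() {
    throw new Error(FUNCTION_NOT_FOUND);
} };
const hasFunction = { call: function(callPath) {
    return { jsonGraph: { list: { length: 1 } }, paths: [['list', 'length']] };
} };
const callResult = chainedCall([noSuchFunction, hasFunction], ['list', 'push'], []);
```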

If a call's paths or suffixes produce unhandledPaths then a get request cycle will be used. Since call is required to provide what paths were produced, the call that fulfilled the request must produce all the paths along with unhandledPaths. unhandledPaths should be an optimized proper subset of paths.

On a catastrophic error (imagine the HttpDataSource calling onError with a 500), call will stop chaining. A catastrophic error can also occur while resolving the unhandledPaths produced from paths / suffixes. If that happens, the same rules as for get apply: the call's partial jsonGraphEnv will be merged and the remaining unhandledPaths will be materialized.

Model

Client-side materializing: if the sources are exhausted, the Model will materialize all unhandledPaths. This will cause a minor version change, as it is an additive change to the Model that remains backwards compatible with prior versions of the DataSource interface.
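
A minimal sketch of the materialization step, assuming the unhandled paths are already simple (fully expanded) paths. materializeUnhandled is a hypothetical helper, not a Model method, and it mutates the graph it is given.

```javascript
// Hypothetical helper: write an atom at every remaining unhandled path.
// Mutates and returns envelope.jsonGraph (or a new graph if none exists).
function materializeUnhandled(envelope) {
    const jsonGraph = envelope.jsonGraph || {};
    (envelope.unhandledPaths || []).forEach(function(path) {
        let node = jsonGraph;
        // Walk (or create) every branch above the leaf.
        path.slice(0, -1).forEach(function(key) {
            node[key] = node[key] || {};
            node = node[key];
        });
        node[path[path.length - 1]] = { $type: 'atom' };
    });
    return jsonGraph;
}

const materialized = materializeUnhandled({
    jsonGraph: { foo: { bar: 'hello world!' } },
    paths: [['foo', 'bar']],
    unhandledPaths: [['foo', 'baz']]
});
// materialized.foo.baz is now { $type: 'atom' }; foo.bar is untouched.
```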

Combining Logic into a single Library

The Subscribable interface, Disposable, and AssignableDisposable, along with an as-yet-unknown amount of additional logic, are shared between falcor and falcor-datasource-chainer. The common pieces of logic should be moved to a shared npm library.

Error Handling

DataSource chaining leaves open some serious questions around error handling. If source[0] emits an error through the onError channel, what should happen? What if source[0] is successful but has unhandledPaths, and source[1] then calls onError (or, even worse, throws an error)? What happens then?

Resolution

On any error we will stop propagating requests from one source to the next. DataSource chaining is not a good solution for fallback values.

Unanswered Questions

  • If DataSourceChainer has 3 sources and source[0] onErrors (say network 500) what is the correct action?
    • Answered above under Error Handling section 11/16/15
  • Matched route but no data, is this an "unhandledPath"?
    • This has been answered: Router change description has been changed to match the resolution. 11/16/15
@ThePrimeagen
Contributor Author

Updated comment with more details.
