JSON:API Layer Refactor #3964

Open
SychO9 opened this issue Feb 10, 2024 · 3 comments · May be fixed by #3971

@SychO9
Member

SychO9 commented Feb 10, 2024

We currently use tobyzerner/json-api-php for our JSON:API layer implementation. Our implementation, which is based on this package, requires quite a lot of boilerplate to read and write resources. The primary goal is to reduce the boilerplate needed.

Additionally, the package has been abandoned for a long time. That should not actually matter: it still works, and regardless of which library we choose, we will very likely need a fork of our own where we can implement custom changes and unique use cases unsupported by the library (and sync back to the original whatever seems valid).

Crucial aspects needed in the revamp:

  • Reduced boilerplate for serializing and writing resources.
  • Maintained extensibility of existing resources (read & write)
  • If applicable, as few changes as possible to the related Extension API.

Current API

Using the current implementation, a developer has to create the following boilerplate for full CRUD of a resource.

Api\Controller
 - CreateGroupController.php
 - ListGroupsController.php
 - ShowGroupController.php
 - UpdateGroupController.php
 - DeleteGroupController.php

Api\Serializer
 - GroupSerializer.php

Api\routes.php
 - ->get('/')
 - ->get('/{id}')
 - ->post('/')
 - ->patch('/{id}')
 - ->delete('/{id}')

Group\Command
 - CreateGroup.php
 - CreateGroupHandler.php
 - DeleteGroup.php
 - DeleteGroupHandler.php
 - EditGroup.php
 - EditGroupHandler.php

*The domain commands are optional, but they are still used a lot for re-usability. This is useful to keep in mind, as we need to maintain that re-usability from within.

The implementations of most CRUD controllers are almost always identical. They follow the same usual steps. For example, the creation of a resource usually goes as:

        // Access check
        $actor->assertRegistered();
        $actor->assertCan('createGroup');

        // Making of the resource
        $group = Group::build(
            Arr::get($data, 'attributes.nameSingular'),
            Arr::get($data, 'attributes.namePlural'),
            Arr::get($data, 'attributes.color'),
            Arr::get($data, 'attributes.icon'),
            Arr::get($data, 'attributes.isHidden', false)
        );

        // Pre-saving events
        $this->events->dispatch(
            new Saving($group, $actor, $data)
        );

        // Validation
        $this->validator->assertValid($group->getAttributes());

        // Saving
        $group->save();

        // Post-saving events
        $this->dispatchEventsFor($group, $actor);

        return $group;

Or listing:

        // Request inputs
        $filters = $this->extractFilter($request);
        $sort = $this->extractSort($request);
        $sortIsDefault = $this->sortIsDefault($request);
        $limit = $this->extractLimit($request);
        $offset = $this->extractOffset($request);

        // Querying/(Optional Searching)
        $queryResults = $this->search->query(
            Group::class,
            new SearchCriteria($actor, $filters, $limit, $offset, $sort, $sortIsDefault)
        );

        // Pagination links
        $document->addPaginationLinks(
            $this->url->to('api')->route('groups.index'),
            $request->getQueryParams(),
            $offset,
            $limit,
            $queryResults->areMoreResults() ? null : 0
        );

        $results = $queryResults->getResults();

        // Eager loading
        $this->loadRelations($results, [], $request);

        return $results;

Alternative JSON:API packages

Most of the implementations of the JSON:API spec (https://jsonapi.org/implementations/#server-libraries-php) have not been updated in some years. Of the rest, only a few are framework agnostic, while the others are specific to the Laravel or Symfony frameworks. The Laravel ones are tightly coupled to Laravel (they expect Laravel's Request and Routing), so those are not an option.

I believe the most appropriate package for this is https://github.com/tobyzerner/json-api-server. Judging from its documentation, the package reduces the necessary boilerplate tremendously, and its API is very readable and well designed (and the code behind it is simple).

class UsersResource extends EloquentResource
{
    public function type(): string
    {
        return 'users';
    }

    public function newModel(Context $context): object
    {
        return new User();
    }

    public function endpoints(): array
    {
        return [
            Endpoint\Show::make(),
            Endpoint\Index::make()->paginate(),
            Endpoint\Create::make()->visible(Laravel\can('create')),
            Endpoint\Update::make()->visible(Laravel\can('update')),
            Endpoint\Delete::make()->visible(Laravel\can('delete')),
        ];
    }

    public function fields(): array
    {
        return [
            Field\Attribute::make('name')
                ->type(Type\Str::make())
                ->writable()
                ->required(),

            Field\ToOne::make('address')->includable(),

            Field\ToMany::make('friends')
                ->type('users')
                ->includable(),
        ];
    }

    public function filters(): array
    {
        return [Filter\Where::make('id'), Filter\Where::make('name')];
    }
}

If we create an abstraction layer on top, we can have this integrated into Flarum to follow our internal needs and deal with changes from the underlying package without breaking our own API.

That would also enable us to enforce the use of our own filtering and searching systems, and preserve event-dispatching behavior similar to that outlined in the previous section.

We will not escape having to fork and adapt any package of choice however as they are bound to have certain behavior we need changed. For example:

  • Validation of attributes is scoped to the attribute's value only. Meaning validation rules such as required_with will not work.
  • Once we pass control of the /api endpoint to the library, we would no longer be able to create simple controller endpoints in /api such as /api/notifications/readAll. So we need to change how routes lead to each resource's endpoints.

Propositions

There are two paths forward here:

  • Although it is abandoned, we could keep the current json-api package and adopt it under flarum to continue maintaining it ourselves. We would then build the abstraction layer we require on top of it, or refactor it as needed. This would of course be a lot more work.
  • We can go with the suggested package in the previous section. But still fork it to adapt it where necessary, and then add an abstraction layer either within core or the forked package (preferably core if not too much code).

The decision between the two will come down to some experimentation, but I feel it likely that the second option is the best.


--> going with option 2

Changes needed

BC Layer

Instead of directly using classes from the library, which is currently in beta and will have breaking changes, we need a layer on top to facilitate updating or switching to a different library. For example, instead of directly extending \Tobyz\JsonApiServer\Resource\AbstractResource, we create our own AbstractResource to extend from.
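
As a rough illustration of that indirection (a sketch only, not the actual implementation): VendorAbstractResource below is a hypothetical stand-in for the library's base class, and routeNamePrefix is an invented helper used just to show where Flarum-specific code could live.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the third-party base class. The real class is
// \Tobyz\JsonApiServer\Resource\AbstractResource and its API may differ.
abstract class VendorAbstractResource
{
    abstract public function type(): string;
}

// Our own base class: the only place in the codebase that names the vendor class.
abstract class AbstractResource extends VendorAbstractResource
{
    // Invented helper, showing that Flarum-specific code can live in this layer.
    public function routeNamePrefix(): string
    {
        return $this->type();
    }
}

// Core and extensions extend our class only, so a vendor change is absorbed in one place.
final class GroupResource extends AbstractResource
{
    public function type(): string
    {
        return 'groups';
    }
}

$resource = new GroupResource();
```

If the underlying library renames or replaces its base class, only AbstractResource would need to change in this arrangement.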

Routing

Instead of giving the library control of the entire /api endpoint, which would limit it to resources only, we need to selectively register each resource's endpoint routes in our own router (/api/resource) and invoke the library on each individual resource route. This maintains the ability to create /api routes other than resources.
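
A minimal sketch of what per-resource registration could look like, assuming a hypothetical RouteTable and registerResource helper (neither is Flarum's or the library's real API). Paths are matched literally here, without parameter parsing, to keep the example short.

```php
<?php
declare(strict_types=1);

// Hypothetical minimal route table: "METHOD path" => route name.
final class RouteTable
{
    /** @var array<string, string> */
    private array $routes = [];

    public function add(string $method, string $path, string $name): void
    {
        $this->routes["$method $path"] = $name;
    }

    public function resolve(string $method, string $path): ?string
    {
        return $this->routes["$method $path"] ?? null;
    }
}

// Expand one resource type into the routes of the endpoints it declares.
function registerResource(RouteTable $routes, string $type, array $endpoints): void
{
    $map = [
        'index'  => ['GET', "/api/$type"],
        'show'   => ['GET', "/api/$type/{id}"],
        'create' => ['POST', "/api/$type"],
        'update' => ['PATCH', "/api/$type/{id}"],
        'delete' => ['DELETE', "/api/$type/{id}"],
    ];

    foreach ($endpoints as $endpoint) {
        [$method, $path] = $map[$endpoint];
        $routes->add($method, $path, "$type.$endpoint");
    }
}

$routes = new RouteTable();
registerResource($routes, 'groups', ['index', 'show', 'create']);

// A plain controller route still fits next to the resource routes.
$routes->add('POST', '/api/notifications/readAll', 'notifications.readAll');
```

Because every resource route is registered individually, named routes such as groups.index remain available for URL generation.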

Saving Flow

Flarum's flow of saving is:

  • authorization -> fill data -> saving hook(event) -> validation -> save -> post-save-hooks(events).

The library's is:

  • authorization -> validation -> fill data -> saving hook -> save -> post-save-hooks.

To maintain BC for the various model Saving events, we need it dispatched before validation, though we will very likely just break BC for this one.
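
To make the BC break concrete, here is a small self-contained sketch. The runFlow function and the two steps are invented for illustration: a saving listener fills in a default color, and validation requires one. The same input passes under one ordering and fails under the other.

```php
<?php
declare(strict_types=1);

// Run the named steps in order against a shared attribute bag.
function runFlow(array $order, array $attributes): array
{
    $steps = [
        // A listener on the saving hook that fills in a missing color.
        'saving' => function (array $a): array {
            $a['color'] ??= '#000000';
            return $a;
        },
        // Validation requires a color to be present.
        'validation' => function (array $a): array {
            if (! isset($a['color'])) {
                throw new InvalidArgumentException('color is required');
            }
            return $a;
        },
    ];

    foreach ($order as $name) {
        $attributes = $steps[$name]($attributes);
    }

    return $attributes;
}

$input = ['nameSingular' => 'test group'];

// Flarum 1.x order: the listener runs first, so validation passes.
$flarumResult = runFlow(['saving', 'validation'], $input);

// Library order: validation runs first and rejects the same input.
try {
    runFlow(['validation', 'saving'], $input);
    $libraryError = null;
} catch (InvalidArgumentException $e) {
    $libraryError = $e->getMessage();
}
```

Extensions that rely on a Saving listener to supply or normalize data before validation are the ones that would be affected by the change in ordering.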

Validation

The way validation works in the library is that each field is given its value from the request, and the field applies whatever validation, in isolation from other request data. This will not work for us as we need to be able to validate for example, that a field is required when another was not provided.

Validation relies on manually provided callbacks, with a helper for passing Laravel rules:

->validate(function ($value, $fail, $context) {
	// ..
})

->validate(Laravel\rules([...]))

Since we entirely rely on illuminate\validation we need to change things to allow directly passing illuminate rules, then gather all of that, and in the validation process use a single validator with all the data.

Schema\Str::make('username') 
    ->rule('max:30')
    ->rule('min:3')
    ->rules(['regex:/^[a-z0-9_-]+$/i'])

Which would also allow us to more easily create helper validation methods:

Schema\Str::make('username')  
    ->requiredOnCreate()  
    ->unique('users', 'username', true)  
    ->regex('/^[a-z0-9_-]+$/i')  
    ->validationMessages([  
        'username.regex' => '...'
    ])  
    ->minLength(3)  
    ->maxLength(30)
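
The "gather all rules, then validate once with all the data" idea can be sketched without illuminate/validation. FieldSketch and validateAll below are hypothetical, and only two rules are implemented; the point is that a rule such as required_without can see sibling fields because the whole payload is validated in one pass.

```php
<?php
declare(strict_types=1);

// Hypothetical field definition that only collects rule strings.
final class FieldSketch
{
    /** @var string[] */
    public array $rules = [];

    public function __construct(public string $name) {}

    public function rule(string $rule): self
    {
        $this->rules[] = $rule;
        return $this;
    }
}

/**
 * Validate the whole payload at once so a rule can look at sibling fields.
 *
 * @param FieldSketch[] $fields
 * @return string[] error messages
 */
function validateAll(array $fields, array $data): array
{
    $errors = [];

    foreach ($fields as $field) {
        $value = $data[$field->name] ?? null;

        foreach ($field->rules as $rule) {
            [$name, $arg] = array_pad(explode(':', $rule, 2), 2, null);

            // Cross-field rule: needs access to the other field's value.
            if ($name === 'required_without' && $value === null && ! isset($data[$arg])) {
                $errors[] = "{$field->name} is required when {$arg} is absent";
            }

            // Single-field rule: only looks at this field's own value.
            if ($name === 'min' && is_string($value) && strlen($value) < (int) $arg) {
                $errors[] = "{$field->name} must be at least {$arg} characters";
            }
        }
    }

    return $errors;
}

$fields = [
    (new FieldSketch('username'))->rule('min:3'),
    (new FieldSketch('email'))->rule('required_without:username'),
];

$missingBoth = validateAll($fields, []);
$shortName = validateAll($fields, ['username' => 'ab']);
```

In the real integration the collected rules would presumably be handed to a single illuminate validator instead of this hand-rolled loop.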

Command Bus (domain commands, not console)

Flarum was built around the concept of a command bus (not the console), where a domain operation takes place without any strong coupling to the controller. This makes it easy to call these operations from other contexts:

new CreateGroup(User::find(1), ['attributes' => [  
    'nameSingular' => 'test group',  
    'namePlural' => 'test groups',  
    'color' => '#000000',  
    'icon' => 'fas fa-crown',  
]])

We need to maintain the ability to re-use the endpoint logic. The endpoint itself is pretty well isolated. It only needs a context object with a request.

With custom changes, we will have something along the lines of:

$group = $api->forResource(GroupResource::class)  
    ->forEndpoint(Create::class)  
    ->execute([  
        'attributes' => [  
            'nameSingular' => 'test group',  
            'namePlural' => 'test groups',  
            'color' => '#000000',  
            'icon' => 'fas fa-crown',  
        ]  
    ]);
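
One possible shape for such a wrapper, as a sketch under stated assumptions: ApiCallSketch is hypothetical, and the handler closure stands in for whatever actually executes the endpoint, which is not shown here.

```php
<?php
declare(strict_types=1);

// Hypothetical fluent wrapper around an endpoint-executing callable.
final class ApiCallSketch
{
    private string $resource = '';
    private string $endpoint = '';

    public function __construct(private Closure $handler) {}

    public function forResource(string $resource): self
    {
        // Clone so a shared instance is never mutated between calls.
        $clone = clone $this;
        $clone->resource = $resource;
        return $clone;
    }

    public function forEndpoint(string $endpoint): self
    {
        $clone = clone $this;
        $clone->endpoint = $endpoint;
        return $clone;
    }

    public function execute(array $data): array
    {
        // Wrap the input in a JSON:API document body, as an HTTP client would.
        $body = ['data' => $data];
        return ($this->handler)($this->resource, $this->endpoint, $body);
    }
}

$api = new ApiCallSketch(function (string $resource, string $endpoint, array $body): array {
    // Pretend to create the model and hand back its attributes with an id.
    return ['id' => 1, 'via' => "$resource.$endpoint"] + $body['data']['attributes'];
});

$group = $api->forResource('groups')
    ->forEndpoint('create')
    ->execute(['attributes' => ['nameSingular' => 'test group', 'namePlural' => 'test groups']]);
```

The caller gets the endpoint's result back directly, without going through an HTTP response.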

Relationship linkage vs Inclusion

Flarum has the unusual case of the show discussion endpoint, which adds the linkage of all the discussion's posts, but only includes a subset (This behavior is the basis for the post stream scrubber feature on the frontend).

The library does not distinguish between the values used for linkage and inclusion, so at the moment we have to override its Serializer class to allow this distinction. Preferably, though, we need to look into a different solution for the post scrubber feature that doesn't require this behavior. I avoided that here to stay in scope.

Eager loading & the serialization process

Problem 1

The library handles eager loading of included relationships by deferring getting the value of each relation and using a buffer of model relations to eager load.

Technically, however, eager loading does not apply only to included relationships. All it takes is for a field's visibility callback or getter callback to access a relation for that relation to require eager loading, so we need a manual API to select which relations to eager load on a given endpoint.

We can't just port the exact logic we already use in our current implementation however:

  1. We want to be able to manually eager load needed relations.
  2. We don't want to replace the library's eager loading of included relations because that takes care of scoping those relations.

We need a hybrid process of:

  1. before serialization, eager load R list of relations that will only be accessed internally and not included in the response.
  2. while serializing eager load S list of included relations with scoping and with R' list of relations we need internally (subset of R).

This is still not perfect however.

Problem 2

The serialization flow of the library is:

  • addToPrimary: Fields are added to a map after a visibility check and resolved immediately by default (unless a callback is used for the getter).
  • serialize : Deferred fields are resolved and added to the map, then the rest of the serialization process happens.

Problem: N+1 Queries are easily produced in this flow, especially coming from the visibility checker, and regardless of the eager loading API we're adding from Problem 1.

Example: a resource has one attribute A and a ToOne relationship B, and attribute A is only visible when B->col is true: ->visible(fn (object $a) => $a->B->col). Even though the library does its own eager loading of included relationship B, the visibility callback is called earlier than that.

Solution 1: Eager load relation B twice. The first one for internal use (visibility checking), the second one for inclusion in the response (scoped).

Solution 2: Change the serialization flow. Defer all the fields, then prioritize resolving the relationship fields first. By the time the visibility checks and getters of the attributes are called, relationship B will have been loaded. However, if a relation that is not meant to be publicly visible is accessed internally, it will still lead to N+1 queries.

--> Going for solution 1 as it's the least complicated and ensures internally accessed relations are always eager loaded while response included relations are always scoped. While this means some duplicate queries, it is still an improvement over the n+1 issue.
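
The trade-off in solution 1 can be shown with a query counter. FakeDb is an invented stand-in for the database, holding relation B for three A models: lazy access in the visibility callback costs one query per model, while loading B twice costs two queries regardless of how many models there are.

```php
<?php
declare(strict_types=1);

// Fake data source that counts how many "queries" it receives.
final class FakeDb
{
    public int $queries = 0;

    /** @var array<int, array{col: bool}> rows of B keyed by the owning A id */
    private array $b = [1 => ['col' => true], 2 => ['col' => false], 3 => ['col' => true]];

    public function loadBFor(int $aId): array
    {
        $this->queries++;
        return $this->b[$aId];
    }

    /** @param int[] $aIds */
    public function loadBForMany(array $aIds): array
    {
        $this->queries++;
        return array_intersect_key($this->b, array_flip($aIds));
    }
}

$ids = [1, 2, 3];

// Lazy: the visibility callback touches B once per model (N queries).
$lazy = new FakeDb();
$visibleLazy = array_filter($ids, fn (int $id) => $lazy->loadBFor($id)['col']);

// Solution 1: one batch load for internal use, a second one for inclusion.
$eager = new FakeDb();
$internal = $eager->loadBForMany($ids);
$visibleEager = array_filter($ids, fn (int $id) => $internal[$id]['col']);
$included = $eager->loadBForMany($ids); // would carry visibility scoping in practice
```

With three models the difference is only 3 queries versus 2, but the lazy count grows with the number of models while the eager count stays fixed.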

Additions

The excellent implementation of the library allows us to add some Flarum-specific features to it (with the goal of reducing boilerplate). This includes both features that already exist in the current API implementation (like default includes) and new features.

Default included relationships

Unfortunately, the library currently has no way of specifying relationships that are included in the response document by default. This needs to be added because it exists in 1.x. We can do so by adding a defaultInclude method on endpoints:

Endpoint\Show::make()
    ->defaultInclude(['groups', 'actor'])

However, I would prefer that we move away from including relations by default and instead specify on the frontend, on each request, which relationships to include. That way, endpoints that require more data than others don't affect the response size elsewhere. Extensions have also taken up the habit of always using default includes, which contributes to bad performance.
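
How defaults and an explicit include parameter might interact, as a sketch: resolveIncludes is a hypothetical helper, and the precedence it applies (an explicit ?include= replaces the defaults entirely) is an assumption of this example rather than confirmed library behavior.

```php
<?php
declare(strict_types=1);

/**
 * Resolve which relationships to include for a request.
 * Assumption: an explicit ?include= parameter replaces the endpoint defaults.
 *
 * @param string[] $defaults
 * @return string[]
 */
function resolveIncludes(array $queryParams, array $defaults): array
{
    if (! array_key_exists('include', $queryParams)) {
        return $defaults;
    }

    // "a,b" => ['a', 'b']; an empty string means "include nothing".
    $requested = explode(',', (string) $queryParams['include']);

    return array_values(array_filter($requested, fn (string $name) => $name !== ''));
}

$byDefault = resolveIncludes([], ['groups', 'actor']);
$explicit = resolveIncludes(['include' => 'posts,user'], ['groups', 'actor']);
$none = resolveIncludes(['include' => ''], ['groups', 'actor']);
```

Under this precedence, a frontend that always sends an include parameter never pays for defaults it did not ask for.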

Custom Endpoints

This is a necessary addition. We have routes such as DeleteAvatarController which return a serialized user model, so it would be better to be able to add custom endpoints within the resources, rather than trying to re-use the JsonApi inside each similar custom route.

Custom request params extraction

Some of our endpoints, like the list posts endpoint, have custom logic for extracting some of the request parameters, like the offset. We need to support custom callbacks on at least the Index endpoint (potentially the Show endpoint as well) to allow customizing the extraction logic per resource.

Endpoint\Index::make()  
    ->extractOffset(function (Context $context, array $defaultExtracts): int {  
        $queryParams = $context->request->getQueryParams();  
  
        if (($near = Arr::get($queryParams, 'page.near')) > 1) {  
            $sort = $defaultExtracts['sort'];  
            $filter = $defaultExtracts['filter'];  
  
            if (count($filter) > 1 || ! isset($filter['discussion']) || $sort) {  
                throw new BadRequestException(  
                    'You can only use page[near] with filter[discussion] and the default sort order'  
                );  
            }  
  
            $limit = $defaultExtracts['limit'];  
            $offset = resolve(PostRepository::class)->getIndexForNumber((int) $filter['discussion'], $near, $context->getActor());  
  
            return (int) max(0, $offset - $limit / 2);
        }  
  
        return $defaultExtracts['offset'];  
    })  
    ->paginate()
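
The arithmetic at the end of that callback centers the page on the requested post. A worked sketch, where offsetNear is a hypothetical helper and integer division is used (an assumption of this example) so the result is always an int:

```php
<?php
declare(strict_types=1);

// Shift the offset back by half a page so the target index lands mid-page,
// clamping at zero when the target is near the start.
function offsetNear(int $index, int $limit): int
{
    return max(0, $index - intdiv($limit, 2));
}

$middle = offsetNear(47, 20); // 47 - 10
$start = offsetNear(3, 20);   // 3 - 10 is negative, so clamped to 0
```

With a limit of 20, a post at index 47 yields an offset of 37, so the page covers indexes 37 to 56 with the target roughly in the middle.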

Endpoint Visibility

The library allows specifying when an endpoint is visible:

Endpoint\Create::make()  
    ->visible(function (Context $context) {  
        $actor = RequestUtil::getActor($context->request);  
        return ! $actor->isGuest() && $actor->can('createGroup');  
    }),

We can add authenticated and can methods to simplify this:

Endpoint\Create::make()  
    ->authenticated()
    ->can('createGroup'),
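
A self-contained sketch of how such helpers could compose. ActorSketch and EndpointSketch are hypothetical, and accumulating conditions with a logical AND is an assumption of this example, not necessarily how the library's visible method behaves.

```php
<?php
declare(strict_types=1);

// Hypothetical actor with just enough behavior for the example.
final class ActorSketch
{
    /** @param string[] $abilities */
    public function __construct(private bool $guest, private array $abilities = []) {}

    public function isGuest(): bool
    {
        return $this->guest;
    }

    public function can(string $ability): bool
    {
        return in_array($ability, $this->abilities, true);
    }
}

// Hypothetical endpoint that ANDs together every visibility condition added to it.
final class EndpointSketch
{
    /** @var array<int, Closure(ActorSketch): bool> */
    private array $conditions = [];

    public function visible(Closure $condition): self
    {
        $this->conditions[] = $condition;
        return $this;
    }

    public function authenticated(): self
    {
        return $this->visible(fn (ActorSketch $actor) => ! $actor->isGuest());
    }

    public function can(string $ability): self
    {
        return $this->visible(fn (ActorSketch $actor) => $actor->can($ability));
    }

    public function isVisibleTo(ActorSketch $actor): bool
    {
        foreach ($this->conditions as $condition) {
            if (! $condition($actor)) {
                return false;
            }
        }

        return true;
    }
}

$create = (new EndpointSketch())->authenticated()->can('createGroup');
```

Both helpers reduce to ordinary visible callbacks, so they add no new mechanism, only shorter call sites.
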
@SychO9 SychO9 added this to the 2.0 milestone Feb 10, 2024
@SychO9 SychO9 self-assigned this Feb 10, 2024
@SychO9 SychO9 linked a pull request Mar 8, 2024 that will close this issue
2 tasks
@tobyzerner
Contributor

Hi @SychO9,

Thanks for choosing json-api-server to power Flarum's new JSON:API layer! It's a great feeling to still be able to contribute to Flarum indirectly. Thank you also for the PRs you've sent to the json-api-server repo - I plan to look at these soon, and I don't foresee any issues getting them merged.

Based on the detail you've included in this issue (and without having looked at your actual fork), I would like to propose that a fork is not necessary, and that all the problems you've mentioned can be solved without one. Of course you are free to fork, but I would argue it's better for both projects if you don't - more improvements go back into json-api-server, and reduces Flarum's maintenance and documentation burden.

Here are some thoughts on everything you've mentioned:

Instead of directly using classes from the library, which is currently in beta and will have breaking changes, we need a layer on top to facilitate updating or switching to a different library.

Fair enough - a stable 1.0 version isn't far off though.

Once we pass control of the /api endpoint to the library, we would no longer be able to create simple controller endpoints in /api such as /api/notifications/readAll. So we need to change how routes lead to each resource's endpoints.

I would suggest adding /api/* as a catch-all route after custom API routes are added in your router. This way your custom routes will match first, otherwise the request will fall through to the catch-all json-api-server handler if not. I don't think it's necessary to selectively register routes for each of your resources to invoke the library as you've suggested. Also, it is possible to define your own custom Endpoint classes in json-api-server to create bespoke routes for resources. See here.
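
The catch-all ordering can be sketched with a first-match dispatcher. The dispatch function and the handler names are illustrative only, not Flarum's router or the library's API.

```php
<?php
declare(strict_types=1);

/**
 * First-match dispatch over an ordered list of [pattern, handler] pairs.
 * A trailing "*" in a pattern matches any suffix.
 */
function dispatch(array $routes, string $path): ?string
{
    foreach ($routes as [$pattern, $handler]) {
        if (str_ends_with($pattern, '*')) {
            if (str_starts_with($path, substr($pattern, 0, -1))) {
                return $handler;
            }
        } elseif ($pattern === $path) {
            return $handler;
        }
    }

    return null;
}

// Custom routes are listed first; the catch-all goes last.
$routes = [
    ['/api/notifications/readAll', 'ReadAllNotificationsController'],
    ['/api/*', 'json-api-server'],
];

$custom = dispatch($routes, '/api/notifications/readAll');
$resource = dispatch($routes, '/api/groups/1');
```

Registration order is what makes this work: if the catch-all were listed first, it would shadow every custom route.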

Flarum's flow of saving is:
authorization -> fill data -> saving hook(event) -> validation -> save -> post-save-hooks(events).
The library's is:
authorization -> validation -> fill data -> saving hook -> save -> post-save-hooks.
To maintain BC for the various model Saving events, we need it dispatched before validation, though we will very likely just break BC for this one.

I agree that it's probably worth breaking BC for this change. However, one of the things I'd like to improve in json-api-server is the ability to have more control over the flow in the Create/Update endpoints, so this is something that could be looked at.

The way validation works in the library is that each field is given its value from the request, and the field applies whatever validation, in isolation from other request data. This will not work for us as we need to be able to validate for example, that a field is required when another was not provided.

There is some discussion about this here. I would suggest using built-in (schema) validation where possible rather than Laravel's validation because then you get a more accurate/useful OpenAPI spec (automatic generation planned for the future). Definitely open to adding more built-in validation functionality to reflect what's supported in the spec. For edge cases where you do need to do some more complex validation, you can do this manually in one of the resource hooks (first solution in the issue linked previously). You could even build this functionality into a subclass of EloquentResource.

We need to maintain the ability to re-use the endpoint logic. The endpoint itself is pretty well isolated. It only needs a context object with a request.

Replacing the command bus with the API itself sounds like a sensible decision. The custom API you've suggested is fine, but this does not require a fork - you could easily build this as a wrapper class which translates into this under the hood:

$request = (new ServerRequest('POST', '/api/groups'))->withParsedBody(['data' => ['attributes' => []]]);
$api->handle($request);

Flarum has the unusual case of the show discussion endpoint, which adds the linkage of all the discussion's posts, but only includes a subset (This behavior is the basis for the post stream scrubber feature on the frontend).

I would suggest changing the frontend to make two requests here rather than trying to wrangle the API layer into this non-standard behaviour.

Eager loading & the serialization process

In my projects, I manually eager-load internally required relationships in the scope method ($query->with('relation')). If you do this, the relation will be available during visibility checking and won't need to be loaded again during serialisation. I also enable Eloquent's strict mode to catch any n+1 queries that slip through.

Default included relationships

I'll be happy to merge the PR you've made for this as the JSON:API spec does allow default includes. However I agree that Flarum should probably move away from them.

Custom Endpoints

I think I've addressed this above.

Custom request params extraction

You could do this by subclassing the Index endpoint. You could also probably subclass OffsetPagination to achieve the same thing.

Endpoint Visibility

Again, this could be achieved with a subclass, though I would argue that passing helper functions into the visible method is just as good.

Happy to discuss anything in more detail - let me know what you think.

@SychO9
Member Author

SychO9 commented Mar 10, 2024

Hi Toby 👋🏼, thanks for taking the time to look over this.

I would like to propose that a fork is not necessary, and that all the problems you've mentioned can be solved without one. Of course you are free to fork, but I would argue it's better for both projects if you don't - more improvements go back into json-api-server, and reduces Flarum's maintenance and documentation burden.

That would be the best path forward. I've sent PRs for some changes that seemed would be easily desirable. The rest of the more Flarum-specific wanted behavior I wasn't certain would make sense in the original package, or at least I thought you'd probably want to implement those types of behaviors in your own way (was planning to create issues though, haven't gotten around to that yet).

I think we will probably keep the fork until the original package either supports the behavior we need or is extensible enough for that behavior to be added on Flarum's side (the fork adds a few hooks to allow that). That way there is also no immediate pressure on the original package itself.

I would suggest adding /api/* as a catch-all route after custom API routes are added in your router. This way your custom routes will match first, otherwise the request will fall through to the catch-all json-api-server handler if not. I don't think it's necessary to selectively register routes for each of your resources to invoke the library as you've suggested. Also, it is possible to define your own custom Endpoint classes in json-api-server to create bespoke routes for resources. See here.

This part turned out not to require changes to the package itself: Flarum now has its own child JsonApi class that can support this behavior, so this does not require forking. I do like the catch-all idea, though, because right now adding to the router requires early resolution of resource endpoints, which is likely to cause issues for extension devs. We would unfortunately lose the nice ability to link to the API routes by name, but that may not be such a loss.

I agree that it's probably worth breaking BC for this change. However, one of the things I'd like to improve in json-api-server is the ability to have more control over the flow in the Create/Update endpoints, so this is something that could be looked at.

This is also no longer an issue 👍🏼, the flow in the fork is the same. Extension devs will have to deal with the breaking change of the Saving event no longer dispatching before validation. It is the more correct behavior with the new changes.

There is some discussion about this tobyzerner/json-api-server#81 (comment). I would suggest using built-in (schema) validation where possible rather than Laravel's validation because then you get a more accurate/useful OpenAPI spec (automatic generation planned for the future). Definitely open to adding more built-in validation functionality to reflect what's supported in the spec. For edge cases where you do need to do some more complex validation, you can do this manually in one of the resource hooks (first solution in the issue linked previously). You could even build this functionality into a subclass of EloquentResource.

Yes, while changing the fork to base all validation off of laravel rules I realized the OpenAPI spec would be in a way sacrificed, but in the context of our refactor, trying to just preserve current behavior above new additions it was ok (as that was already very difficult 😅). I have not looked much into the OpenAPI stuff, but it could probably also be generated from laravel rules 🤔

However, the Laravel validation behavior also isn't required at the package level. If the package can just open up the assertDataValid method for overriding or, as it does in other spots, check whether the resource provides a validation method/validator class (or something of the sort), then we can have the added Laravel rules trait in Flarum itself.

Replacing the command bus with the API itself sounds like a sensible decision. The custom API you've suggested is fine, but this does not require a fork - you could easily build this as a wrapper class which translates into this under the hood:

Yes! Though it's nice not to have to apply unnecessary serialization in this case. That separation of endpoint result vs endpoint response was also made for the sake of supporting a custom endpoint in the following manner:

Endpoint\Endpoint::make('deleteAvatar')
    ->route('DELETE', '/{id}/avatar')
    ->action(function (Context $context) {
        // ... logic

        return $context->model; // auto serialized
    })
    ->response(function (Context $context, User $result) {
        // optional, if not provided, result from action will be auto serialized.

        return new Response(204);
    }),

This is very opinionated though, but easier than to create a class for each custom endpoint. Especially since we often have command handlers to dispatch from inside action anyway.

Curious about your opinion on this (though I'm still planning to submit issues on the more opinionated changes, not all was documented here). This is probably the change that most requires a fork.

https://github.com/flarum/json-api-server/blob/main/src/Endpoint/Endpoint.php

I would suggest changing the frontend to make two requests here rather than trying to wrangle the API layer into this non-standard behaviour.

That would be much better, need to look at this again.

In my projects, I manually eager-load internally required relationships in the scope method ($query->with('relation')). If you do this, the relation will be available during visibility checking and won't need to be loaded again during serialisation. I also enable Eloquent's strict mode to catch any n+1 queries that slip through.

It gets a little more complicated with extensibility. scope is only called on listing and on including (eager loading is also very relevant on other endpoints as we've found). We could allow extending the scope method like we do with fields and endpoints, but it would be a downgrade from https://github.com/flarum/json-api-server/blob/main/src/Endpoint/Concerns/HasEagerLoading.php

But that's also not detrimental, since we can have the trait inside Flarum as well, the only thing we really require of the package here is a hook before serialization: https://github.com/flarum/json-api-server/blob/main/src/Endpoint/Endpoint.php#L120-L122 which ties back to the changes we made for the Endpoint.

We've also added inverse relationship setting, so when an included relation B is serialized for model A and the relation field defines what the inverse relation is, the EloquentBuffer auto sets it back on the loaded relation. I will try to contribute this in the next few days.


To conclude, I believe most custom behavior can be applied from outside the package, with some changes to make the package open to that. Perhaps the only blockers are the beforeSerialization hook and the Endpoint changes to support that, along with custom inline endpoint actions, as well as resolving endpoints, fields, sorts and filters and then caching them through additional resolveEndpoints (etc.) methods, so that there is a layer that allows adding to resources.

I'll send some PRs to allow simple custom behavior to be used (like for the assertDataValid method). But for the endpoint beforeSerialization hook, I'll leave it up to you to decide what you want done and how.

We could override every single endpoint (that's what the first iterations did), but would be nice to avoid.

Thanks again! Whatever changes are accepted or made to allow removing the fork, I will test again as they happen and see what blockers remain each time, until we no longer need a fork.

@tobyzerner
Contributor

tobyzerner commented Mar 30, 2024

This is very opinionated though, but easier than to create a class for each custom endpoint. Especially since we often have command handlers to dispatch from inside action anyway.
Curious about your opinion on this (though I'm still planning to submit issues on the more opinionated changes, not all was documented here). This is probably the change that most requires a fork.

Unless I'm missing something, this could easily exist as a class in Flarum (implementing json-api-server's Endpoint interface) rather than requiring a fork? I do think a separate class for each custom endpoint – in the same way that Controllers are classes in MVC – is a better architectural pattern (reusability, testability, etc). And there shouldn't need to be too many of them if you're building a RESTful API.

I can see the need for a beforeSerialization hook, and happy to implement this in some form. Can you please make an issue for it?
