Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split up NodeAddress into dedicated classes #4564

Open
bwaidelich opened this issue Sep 27, 2023 · 14 comments
Open

Split up NodeAddress into dedicated classes #4564

bwaidelich opened this issue Sep 27, 2023 · 14 comments

Comments

@bwaidelich
Copy link
Member

Currently the NodeAddress in the Neos\Neos\FrontendRouting namespace has a couple of issues:

  • It encapsulates contentStreamId and workspaceName even though the latter can be derived from the former (there are reasons for this, but it makes this object a little bit quirky)
  • To support linking across sites/CRs (Cross-ContentRepository linking #4441) we would have to add yet another property contentRepositoryId to the object even though that information is not relevant for its most important usecase (serializing a node)

Both not major issues, but it turns out that this object is used in many many places already for completely different purposes which it was never meant to be used for:

  • As DTO just to pass contentStreamId, DSP and node id around (~30 cases like this one)
  • To determine whether the node is in live workspace (~10 cases like this one
  • To serialize the node address (~40 cases like this one)

IMO only the last usecase fits the object and it would be worth to introduce dedicated DTOs for each:

  • One for addressing a single node globally
  • One for (de)serializing a node from/to a string/URI
@bwaidelich bwaidelich added the 9.0 label Sep 27, 2023
@bwaidelich
Copy link
Member Author

Suggestion

Both usecases (addressing and (de)serializing nodes) seem to be very common, so I would suggest to add those new DTOs to the Neos.ContentRepository.Core package:

namespace Neos\ContentRepository\Core\SharedModel\Node;

final readonly class NodeIdentity {
    public function __construct(
        public ContentRepositoryId $contentRepositoryId,
        public ContentStreamId $contentStreamId,
        public DimensionSpacePoint $dimensionSpacePoint,
        public NodeAggregateId $nodeAggregateId,
    ) {}
}
namespace Neos\ContentRepository\Core\SharedModel\Node;

final readonly class NodeSerializer {
    public function __construct(
        private WorkspaceName $workspaceName,
        private DimensionSpacePoint $dimensionSpacePoint,
        private NodeAggregateId $nodeAggregateId,
    ) {}

    public static function fromString(string $value): self { /* ... */ }

    public static function fromUri(UriInterface $uri): self { /* ... */ }

    public function toString(): string { /* ... */ }

    public function toQueryString(UriInterface $uri): UriInterface { /* ... */ }
}

And maybe a third one to be able to (de)serialize nodes with their CR id – Alternatively the NodeSerializer could have parameters for this

@mhsdesign
Copy link
Member

Since the NodeAdress is a concept that will be used frequently by the user we should definitely polish this concept.

Main usecase would be to serialize a node into the html so a javascript application can used it as get parameter and on the server we deserialize it.

@bwaidelich
Copy link
Member Author

Main usecase would be to serialize a node into the html so a javascript application can used it as get parameter and on the server we deserialize it.

Just a little side note: It might be tempting to add corresponding methods to the Value Object itself, but I would suggest to keep URL/HTML/JSON,.... specifics out of it and have a separate class for (de)serialization (as suggested above)

@mhsdesign
Copy link
Member

mhsdesign commented Nov 12, 2023

yes separating this makes perfect sense!

i think we need two (or three) main serialization formats that we would use ourselves:

1. uri parameter serialization ?node={workspace}__{dimensionBase64}__{nodeId}

  • example: ?node=user-admin__eyJsYW5ndWFnZSI6ImVuX1VTIn0%3D__995c9174-ddd6-4d5c-cfc0-1ffc82184677
  • used for routing, as only live nodes have speaking url paths
  • could be used by neos integrators when serializing nodes into the frontend js application
  • deserialization needs crId (which is known through the host http://127.0.0.1:8081 because it maps to as site entity which maps to a content repository id)

2. custom node protocol serialization node://{nodeId}

  • example node://f0eb464e-4701-d909-7023-34f06056bf7c
  • used for simple linking in ck editor
  • deserialization needs crId, dsp, csid

3. NEW fully qualified serialization format possibly json with the fields contentRepositoryId, workspaceName, dimensionSpacePoint and nodeAggregateId

  • example {"contentRepositoryId":"default","workspaceName":"live","dimensionSpacePoint":{"language":"en_US"},"nodeAggregateId":"995c9174-ddd6-4d5c-cfc0-1ffc82184677"}
  • could be used for fusion uncached @cache de(serialization) 9.0 Avoid getActiveRequestHandler in node property mapper for uncached fusion rendering #4732
  • could be used by neos integrators when serializing nodes into the frontend js application, especially needed when working with multiple cress
  • deserialization doesnt need anything

4. NEW fully qualified custom node protocol serialization node://{crId}__{workspace}__{dimension}__{nodeId}

  • example node://default__user-admin__eyJsYW5ndWFnZSI6ImVuX1VTIn0%3D__995c9174-ddd6-4d5c-cfc0-1ffc82184677
  • could be used for cross dimension & content repository linking see Cross-ContentRepository linking #4441
  • deserialization doesnt need anything

@kitsunet
Copy link
Member

kitsunet commented Nov 12, 2023

I don't see why 1, 3 and 4 are separate things, you can transport json via uri perfect fine, and why would you ever want a non fully qualified format. That said 4 (but please more sensible separator character) could work too for me, just that json also works in other "contexts" than a URI.

I alson wonder about 2, the stored uris here need to be fully addressable in terms of the cr, so it needs to contain the nodeAggregate and ContentRepositoryId. How else would you cross link to a node in another cr?

@bwaidelich
Copy link
Member Author

bwaidelich commented Nov 12, 2023

I don't see why 1, 3 and 4 are separate things

Yes, I was wondering the same; can't we just come up with a common serialization format that we use in URIs (and potentially in other places)?

Some central "serializer" class that can convert node addresses to JSON vice versa.
For example (slightly different from my suggestion above):

final readonly class NodeSerializer {
    private function __construct() {}

    public static function nodeAddressFromJson(string $json): NodeAddress { /* ... */ }

    public static function nodeAddressToJson(NodeAddress $nodeAddress): string { /* ... */ }
}

By the way: With #4708 we might not need the contentStreamId at all in the NodeAddress!
And, related to that: I think we should never have contentStreamId and workspaceName in one DTO.

Question: What about the CR Id that is part of 3. and 4. but not of 1. (NodeAddress)?
It could be two separate methods on that serializer above, along the lines of:

// ...

    public static function contentRepositoryIdFromJson(string $json): ContentRepositoryId { /* ... */ }

    public static function NodeAddressAndContentRepositoryIdToJson(NodeAddress $nodeAddress, ContentRepositoryId $contentRepositoryId): string { /* ... */ }

...but I'm not sure about that.

I alson wonder about 2, the stored uris here need to be fully addressable in terms of the cr,
so it needs to contain the nodeAggregate and ContentRepositoryId. How else would you cross link to a node in another cr?

The idea was to keep support for: node://<node-id> to resolve nodes of the current CR and add support for node://<cr-id>/<node-id> or node://<node-id>@<cr-id> for cross CR links, but the ConvertUris implementation does not yet support this (see #4441)

@kitsunet
Copy link
Member

kitsunet commented Nov 12, 2023

This suggests to me the NodeAddress should be a more internal thing that is localised to hand addresses around within a defined content repository. for any more "public" use it seems important to know the content repository as well?

As in it should never be serializable because it is incomplete and would always need further context to work?

@kitsunet
Copy link
Member

To make a further point, we already have an absolute serialisation format that is highly specialised, the live URI. This can include the CR hidden in the domain, the dimensions, the nodeAggregate and by virtue of being specialised the workspace. So it's a fine serialization format as it includes all necessary information to deserialize into a specific node.

@bwaidelich
Copy link
Member Author

bwaidelich commented Feb 9, 2024

After our discussion today, I would suggest to replace the contentStreamId from the DTO and refer to the workspaceName instead:

namespace Neos\ContentRepository\Core\SharedModel\Node;

final readonly class NodeIdentity {
    public function __construct(
        public ContentRepositoryId $contentRepositoryId,
        public WorkspaceName $workspaceName,
        public DimensionSpacePoint $dimensionSpacePoint,
        public NodeAggregateId $nodeAggregateId,
    ) {}
}

because we probably don't want to expose the contentStreamId most of the cases (we'll probably still need a separate DTO for this for internal node addressing)

One of the goals would be to replace the Neos\Neos\FrontendRouting\NodeAddress in almost all places.

Another goal is to make mapping of nodes in controllers easier, for example:

final class SomeController extends AbstractController {
  public function __construct(private readonly NodeSerializer $nodeSerializer) {}

  public function someAction(NodeIdentity $nodeIdentity): void {
    $node = $this->nodeSerializer->nodeFromIdentity($nodeIdentity);
  }
}

Note: We won't be able to allow a mapping to the Node read model because it does not contain all required information (e.g. workspace name) – but mapping entities is a very bad practice anyways *g

@mhsdesign
Copy link
Member

Just to add another note from the previous weekly (9.2.). We discussed that a simple json presentation of the NodeIdentity should replace the current NodeAdress->serializeToUri. That means we do not need any special formats but simple json.

The reasons for this were that delimiting by special characters like _ is always tedious to work with and might even cause odd errors like: #4345

Instead a simple json string representation like {"contentRepositoryId":"default","workspaceName":"live","dimensionSpacePoint":{"language":"en_US"},"nodeAggregateId":"995c9174-ddd6-4d5c-cfc0-1ffc82184677"} contains all informations and is self describing.

About the possible implications that the contentRepositoryId is over-specified once by NodeIdentity and by being inferred to the urls domain, we agreed that this is merely and aesthetic problem with no technical downsides.

@mhsdesign
Copy link
Member

mhsdesign commented Mar 15, 2024

@grebaldi and me discussed this matter.
We want to keep the encoding short as query param space is limited and also to reduce overhead in the redux store where we load all node identities.

For our main backend neos/content/ for a clear intention we propose that we use actual slashes and routing:

pro

  • easy to read (as developers) and easy to possibly modify (goto node x)

con

  • hard to use (custom flow routing configuration with routing arguments)
localhost:8080/neos/content/default/live/language=en_US&audience=intelligent+people/995c9174-ddd6-4d5c-cfc0-1ffc82184677

alternative as root level query params

localhost:8080/neos/content?contentRepositoryId=default&workspaceName=live&dimensionSpacePoint%5Blanguage%5D=en_US&diemension%5Baudience%5D=nice+people&nodeAggregateId=995c9174-ddd6-4d5c-cfc0-1ffc82184677

And for service endpoints where we want to serialise the node identity as query parameter for simplicity

pro

  • easy to use, simple decode as query parameter into any endpoint

con

  • hard to read and debug, due to the escaping of special characters (only _-.; are allowed)
localhost:8080/neos/service-get-title-of-node?node=default_live_GFuZ3VhZ2U9ZW5fVVMmYXVkaWVuY2U9aW50ZWxsaWdlbnQrcGVvcGxl_995c9174-ddd6-4d5c-cfc0-1ffc82184677

The dimensionspacepoint will be calculated via this expression:

base64_encode(http_build_query(['language' => 'en_US', 'audience' => 'nice people']));

which results in a shorter string than if json_encode is used instead of http_build_query

@kitsunet
Copy link
Member

kitsunet commented Mar 15, 2024

does anyone really need to read this? Also the latter variant somehow contains space characters? implicit ordering might also be dangerous-ish? aaand base64 can add = as padding wihch then is not allowed in get parameters. so I guess rather urlencode is needed....

@kitsunet
Copy link
Member

Actually why ont combine both to some degree?

default/live/language%3Den_US%26audience%3Dintelligent%2Bpeople/995c9174-ddd6-4d5c-cfc0-1ffc82184677

would work both as path part as well as node= GET param. The url encode is strictly not necessary when used as path segment but doesn't harm either. (and I guess the http framework takes care of that anyways so we don't have to).

Yes we need a route part handler for this, but that seems somewhat trivial?

@mhsdesign
Copy link
Member

mhsdesign commented May 18, 2024

i like the idea of being compatible for both.

what do you say about it:

yield 'no dimensions' => [
    'nodeAddress' => NodeAddress::create(
        ContentRepositoryId::fromString('default'),
        WorkspaceName::forLive(),
        DimensionSpacePoint::createWithoutDimensions(),
        NodeAggregateId::fromString('marcus-heinrichus')
    ),
    'serialized' => 'default/live//marcus-heinrichus'
];

yield 'one dimension' => [
    'nodeAddress' => NodeAddress::create(
        ContentRepositoryId::fromString('default'),
        WorkspaceName::fromString('user-mh'),
        DimensionSpacePoint::fromArray(['language' => 'de']),
        NodeAggregateId::fromString('79e69d1c-b079-4535-8c8a-37e76736c445')
    ),
    'serialized' => 'default/user-mh/language%3Dde/79e69d1c-b079-4535-8c8a-37e76736c445'
];

yield 'two dimensions' => [
    'nodeAddress' => NodeAddress::create(
        ContentRepositoryId::fromString('second'),
        WorkspaceName::fromString('user-mh'),
        DimensionSpacePoint::fromArray(['language' => 'en_US', 'audience' => 'nice people']),
        NodeAggregateId::fromString('my-node-id')
    ),
    'serialized' => 'second/user-mh/language%3Den_US%26audience%3Dnice%2Bpeople/my-node-id'
];

this is only bit weird:

default/user-mh//79e69d1c-b079-4535-8c8a-37e76736c445

dsp is encoded via:

parse_str(urldecode($encoded), $parsed);
return self::instance($parsed);
return urlencode(http_build_query($this->coordinates));

PR: -> #5067

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

Successfully merging a pull request may close this issue.

3 participants