This repository has been archived by the owner on May 3, 2023. It is now read-only.

Resolve URIs dynamically through prefix mapping. #8

Open
osamesama opened this issue Mar 30, 2022 · 3 comments

Comments

@osamesama

I have a folder full of schema files, each defined with an http-based value for $id. To facilitate loading them in json-schema-bundle, it would be nice to have the option to specify how some http URIs are resolved. For example, one could register that URIs beginning with http://domain/path are resolved after mapping them to file://path. Then each schema to be bundled doesn't need to be individually registered with add.

Is such a feature under consideration?
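For illustration, the prefix mapping requested here could look roughly like the sketch below. `mapUri`, `prefixMap`, and the example URIs are hypothetical names chosen for this example; nothing like this is part of json-schema-bundle.

```javascript
// Hypothetical helper, not part of json-schema-bundle.
// Rewrites a URI whose beginning matches one of the registered prefixes.
function mapUri(uri, prefixMap) {
  // Try the longest prefixes first so that more specific rules win.
  const prefixes = Object.keys(prefixMap).sort((a, b) => b.length - a.length);
  for (const prefix of prefixes) {
    if (uri.startsWith(prefix)) {
      return prefixMap[prefix] + uri.slice(prefix.length);
    }
  }
  return uri; // No rule matched: leave the URI untouched.
}

// Assumed mapping from an http-based $id prefix to a local directory.
const prefixMap = {
  "http://example.com/schemas/": "file:///project/schemas/"
};

console.log(mapUri("http://example.com/schemas/person.schema.json", prefixMap));
// file:///project/schemas/person.schema.json
```

URIs that match no prefix pass through unchanged, so a default retrieval strategy would still apply to them.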

@jdesrosiers
Contributor

I'd rather not go that direction. It feels like a hack. When you are working with schemas on the file system, the recommended approach is to not use $ids in your schemas and instead rely on their natural file URIs for identifiers. Your references should all be relative so you won't expose any file system paths that are specific to your local machine.

Another option would be for me to provide a helper function that takes a path to a directory and adds all the schemas in that directory for you. That would be very easy to do.
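A directory helper of this kind might be sketched as follows. This is not the helper the maintainer has in mind, only one possible shape: `addSchemasFromDirectory` is a hypothetical name, and the `register` callback stands in for whatever registration function is used (for example, a wrapper around the library's `add`).

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: reads every *.json file directly inside `dir`,
// parses it, and passes the resulting schema to the `register` callback.
// Returns the paths of the files that were registered.
function addSchemasFromDirectory(dir, register) {
  const added = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    // Skip subdirectories and anything that is not a .json file.
    if (!entry.isFile() || path.extname(entry.name) !== ".json") continue;
    const file = path.join(dir, entry.name);
    const schema = JSON.parse(fs.readFileSync(file, "utf8"));
    register(schema);
    added.push(file);
  }
  return added;
}
```

The sketch does not recurse into subdirectories and lets a malformed JSON file throw, which seems reasonable for a build-time step but is a choice made here, not a documented behavior.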

@osamesama
Author

I can see your point, and I certainly respect your desire to avoid hacks. Let’s not pollute such a fine library.

I could be wrong, but I have gathered that the main purpose of $id is to uniquely identify the schema, rather than merely to suggest where one must go to retrieve it, though admittedly it can serve both purposes if we want it to. Because JSON Schema allows familiar network protocols like HTTP in IDs, it is fairly easy for everyone to choose their own IDs with little fear of collision, thanks to HTTP's built-in namespacing. HTTP URLs also make relative references easy to describe and resolve, which promotes concise references to related schemas. But using URLs for IDs in no way mandates that the schemas are actually retrievable over the network.

So registering the schemas with add, to facilitate local resolution, makes a lot of sense. And if we can also avoid verbosity (and noisy code) by adding an entire folder at once, as you suggest, that certainly solves a lot of problems — probably mine as well. Thanks for considering it. I would use it.

Still, there will be cases where bundling all of the required schemas with the package we are building is not the better option. At the same time, especially when we run a software delivery pipeline with multiple deployment stages, we may not want to presume that blindly following the ID as a URL will give us what we need. And when the URI is a URN, there may be nothing to follow at all.

There is a saying in software engineering that every problem can be solved with an extra level of indirection. I imagine that some way to plug in a URL/URN resolver, or some other form of the strategy pattern, could provide useful power. json-schema-bundle may already have something close to this built in, since it resolves based on what has been registered, perhaps falling back to network retrieval (just guessing). Obviously, whatever you've already got there would remain the default strategy.

Anyway, it's just something to consider. Done making my case. :-) Thanks for chewing on it and supporting a great library. And the idea you proposed, for "mass registration," will also work just fine for my specific case.

@jdesrosiers
Contributor

> I have gathered that the main purpose of $id is to uniquely identify the schema

Yes, but that's only part of the story of schema identification. $id allows you to embed a base URI in the schema, but that's only the first layer of the onion of establishing a base URI (RFC 3986, Section 5.1, "Establishing a Base URI"). The retrieval URI is part of the algorithm too. I suggest reading "Understanding JSON Schema - Schema Identification" for a JSON Schema-specific explanation of how this works.
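The interplay between $id and the retrieval URI can be illustrated with Node's built-in `URL` class. This is only a sketch: `effectiveUri` is a hypothetical name, the URIs are made up, and `URL` follows the WHATWG URL standard rather than RFC 3986 itself, although the two agree for simple cases like these.

```javascript
// Hypothetical illustration: a schema's identifier is its $id (if present)
// resolved against the URI the schema was retrieved from.
function effectiveUri(retrievalUri, schema) {
  return schema.$id
    ? new URL(schema.$id, retrievalUri).href
    : new URL(retrievalUri).href;
}

const retrievedFrom = "file:///project/schemas/person.schema.json";

// No $id: the retrieval URI is the identifier.
console.log(effectiveUri(retrievedFrom, {}));
// file:///project/schemas/person.schema.json

// Absolute $id: the embedded URI takes over.
console.log(effectiveUri(retrievedFrom, { $id: "http://example.com/schemas/person" }));
// http://example.com/schemas/person

// A relative reference is resolved against whichever base URI applies.
console.log(new URL("address.schema.json", retrievedFrom).href);
// file:///project/schemas/address.schema.json
```

With no $id and only relative references, every identifier stays relative to wherever the files happen to live, so no machine-specific path needs to appear in the schemas themselves.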

Loading a schema with add puts it in a cache. The cache is checked first. If the URI isn't in the cache, the library attempts to retrieve the schema based on the URI scheme. It would be easy to add a mapping layer, but I don't see how that can fit into the URI spec. Using file URIs is generally the right way to address this kind of problem, but if you can't do that for some reason, things like loading all schemas from a directory are simple to implement and work just as well with only a few extra lines of code.
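The lookup order described above (cache first, then retrieval by URI scheme) can be sketched in a few lines. This is not the library's actual implementation: `createResolver`, its `add(uri, schema)` signature, and the `schemeHandlers` object are all hypothetical and exist only to show the order of operations.

```javascript
// Hypothetical resolver mirroring the described lookup order.
// `schemeHandlers` maps a URI scheme ("file", "http", ...) to a function
// that retrieves a schema for a URI using that scheme.
function createResolver(schemeHandlers) {
  const cache = new Map();
  return {
    // Register a schema under a URI so that no retrieval is needed later.
    add(uri, schema) {
      cache.set(uri, schema);
    },
    get(uri) {
      // 1. Anything registered with add() is served from the cache.
      if (cache.has(uri)) return cache.get(uri);
      // 2. Otherwise dispatch on the URI scheme ("file:" becomes "file").
      const scheme = new URL(uri).protocol.slice(0, -1);
      const handler = schemeHandlers[scheme];
      if (!handler) throw new Error(`No retrieval handler for scheme "${scheme}"`);
      return handler(uri);
    }
  };
}
```

In this sketch a URN such as `urn:example:person` resolves only if it was registered beforehand, since there is no handler for the `urn` scheme to fall back on.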
