Why a locate hook is unnecessary #52

Closed
guybedford opened this issue May 30, 2015 · 33 comments

@guybedford

In creating the upgrade path in SystemJS for permitting URLs as module identifiers, it has turned out best to deprecate the locate hook. This may be over-explaining the obvious or dwelling on decisions already made, but coming to this conclusion has taken me a surprising amount of consideration so I'd like to describe the reasoning behind this here to retain some reference for the decision and attempt to leave somewhat reasoned feedback.

The question that started this was whether we should re-introduce the locate hook into the specification. Re-introducing the locate hook will enable normalize to normalize module names into a custom schema that can be defined by the loader implementation and form the string names that are stored in the module registry. Locate then handles the final resolution into URLs that can be fetched.

The justification for considering this was to retain compatibility with AMD-style module loading where we have a baseURL-schema. In this schema, module names are always stored in the registry as plain names relative to some baseURL. jspm also uses its own schema in the registry to refer to modules such as npm:module@x.y.z.

There is a draw to storing these universal schema names inside the module registry as a portable naming system, but I'd argue this lure is mostly one of elegance as opposed to practicality.

I've implemented the baseURL-schema normalization in the current SystemJS, and have been experimenting recently with at least three different complete implementations of normalization of a custom schema alongside URLs (the new requirements of the spec, which completely make sense).

In the end, trying to make a custom schema work alongside URLs in the same registry space, ends up causing more issues, for no practical gain.

Dot Normalization

As soon as we allow both AMD-style module IDs relative to some baseURL alongside URLs, the first issue we hit is the need to define "dot normalization". This basically means that relative normalization needs to be defined for the subset of both URLs and non-URLs.

It's not a lot of code, but it is the first sign here that we're duplicating work.

Non-uniqueness

The next issue we have is the non-uniqueness of our schema. The issue here is that import '/local/path.js' is now distinct and separate from the module at import 'local/path.js' in the scenario where baseURL='/' (one resolves as a name and the other as a URL). This will cause confusion, as we are allowing the same unique module to be referred to by two different possible names, breaking a key principle of the registry: that each module has one unique key.

Having two ways to refer to the same module is a bug waiting to happen, causing problems for configuration (which variation do we configure?), creating the possibility of a module being executed twice, and interfering with bundling workflows.

Expecting the user to know that they should write import('x') instead of import('./x') arbitrarily is a hard ask.

This leads down a road of trying to catch these uniqueness issues in the normalization pipeline itself, which then ends up becoming URL normalization, followed by a reverse normalization into the schema.
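
To make the problem concrete, here is a rough sketch assuming a hypothetical mixed-space normalize and locate pair (the function names and the http://site.com/ base are illustrative only). Both specifiers end up at the same address, yet they produce two different registry keys:

```javascript
const baseURL = 'http://site.com/';

// Hypothetical mixed-space normalize: URL-like specifiers become URLs,
// everything else stays a plain name relative to baseURL.
function normalize(name) {
  const isURLLike = /^([a-z][a-z0-9+.-]*:|\/|\.\.?\/)/i.test(name);
  return isURLLike ? new URL(name, baseURL).href : name;
}

// Hypothetical locate: the address that would finally be fetched.
function locate(key) {
  return new URL(key, baseURL).href;
}

const keyA = normalize('/local/path.js'); // resolved as a URL
const keyB = normalize('local/path.js');  // kept as a plain name

console.log(keyA === keyB);                 // false: two registry keys
console.log(locate(keyA) === locate(keyB)); // true: one fetched address
```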

All schemas have the non-uniqueness problem

Schema non-uniqueness with URLs applies to any custom schema chosen that maps to URLs, not just the baseURL system. Even if we come up with the perfect custom naming schema, as soon as we want that schema to co-exist alongside URL requests we hit these issues.

In order to retain unique identification and configuration of modules, one ends up normalizing from schema space into URL space, and then reverse-normalizing back into schema space at the end of normalize, before resolving back into URLs from the schema in locate, just in order to have our perfect schema names stored in the registry.

Add to this the idea of a configuration space consisting of both schema and URL identifiers as well, and this compounds the problem even further.

One ends up swapping between spaces in such a way that URLs become the primary space anyway, and we're just pretending that the schema is the primary space.

Beyond the baseURL-schema

Another common issue with baseURLs is that when back-tracking below the baseURL, we end up with "normalized" paths looking like "../../module.js", which is really not acceptable for a naming system either.

If we return to the question of what AMD's baseURL schema is really trying to accomplish, the core principle is one of portability of modules, which is completely in agreement with what we should be aiming for. URLs are obviously not a portable naming system for modules (modules can move between environments and hence change URL), so the question is simply how to maintain portability of modules in spite of using URLs?

URLs are the schema

It turns out to be very simple to do this - normalization is seen as the process of converting a "portable module name" into an "environment-specific name". And the most environment-specific name is the URL which we store in the module registry.

The concept that we need to have a registry based on our perfect portable schema is flawed. We still keep our schema if we like - which we can bundle into just the same:

System.register('custom:portable/schema', ...);

Where the name above is normalized into a resolved name of http://www.site.com/packages/custom/portable/schema.js by the loader when being processed and stored in the registry (bundle names are now treated as unnormalized).

There is no big loss that the registry now contains this value under an environment-specific URL instead of the schema. One can just accept that any lookup into the registry must pass through a normalization phase first:

// lookup a module by its schema name by passing through a simple URL-normalization first
Reflect.loader.lookup(Reflect.loader.schemaToURL('custom:portable/schema'));
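
As a rough sketch of what such a schemaToURL step could look like: the schemaRoots table, the default parent URL and the Map standing in for the registry are all assumptions for illustration, not part of any loader API.

```javascript
// Assumed environment-specific mapping from schema prefix to a URL root.
const schemaRoots = {
  'custom:': 'http://www.site.com/packages/custom/'
};

// Hypothetical sync normalization: schema names map into URL space,
// anything else is resolved as a URL against the parent.
function schemaToURL(name, parentURL = 'http://www.site.com/') {
  for (const [prefix, root] of Object.entries(schemaRoots)) {
    if (name.startsWith(prefix)) {
      return root + name.slice(prefix.length) + '.js';
    }
  }
  return new URL(name, parentURL).href;
}

// A plain Map stands in for the module registry, keyed by URL only.
const registry = new Map();
registry.set(schemaToURL('custom:portable/schema'), { name: 'schema module' });

// Lookup by portable name always passes through normalization first.
const found = registry.get(schemaToURL('custom:portable/schema'));
console.log(schemaToURL('custom:portable/schema'));
// http://www.site.com/packages/custom/portable/schema.js
```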

If an implementor really wants to use a custom schema, make the schema URL-based and add the implementation to the fetch hook so everything works out well anyway:

import 'custom:///portable/schema'

The other consequence of using URLs is that configuration then always goes through a normalization phase itself:

Reflect.loader.configure({
  module: {
    './some/local/module.js': {
      moduleFormat: 'CommonJS'
    }
  }
});

The loader would normalize the key in the above configuration into http://site.com/local/path/some/local/module.js.
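
A small sketch of that configuration step, assuming a hypothetical normalizeConfig function and a page located at http://site.com/local/path/ (both are illustrative, not loader API):

```javascript
// Assumed URL of the page (or parent) the configuration is resolved against.
const pageURL = 'http://site.com/local/path/';

// Hypothetical: rewrite every key of config.module into URL space.
function normalizeConfig(config, parentURL) {
  const normalized = {};
  for (const [key, value] of Object.entries(config.module)) {
    normalized[new URL(key, parentURL).href] = value;
  }
  return normalized;
}

const moduleConfig = normalizeConfig({
  module: {
    './some/local/module.js': { moduleFormat: 'CommonJS' }
  }
}, pageURL);

console.log(Object.keys(moduleConfig));
// [ 'http://site.com/local/path/some/local/module.js' ]
```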

The benefit of this is that users don't need to understand the special naming schema - they can just reference modules as URLs exactly as they expect and correctly configure things without needing to have studied the system in detail.

One implication here for implementors is that build systems wanting to use portable naming system schemas need to reverse-map the schemas at build time from URLs in the registry, but that is a very minimal cost and a straightforward 1-1 mapping.
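
A sketch of that reverse mapping, assuming the same kind of prefix-to-root table a loader might use for forward normalization (the urlToSchema name and the table are hypothetical). It only works when the forward mapping really is 1-1:

```javascript
// Assumed mapping from schema prefix to URL root, as known to the build tool.
const schemaRoots = {
  'custom:': 'http://www.site.com/packages/custom/'
};

// Hypothetical build-time step: turn a registry URL back into a portable name.
function urlToSchema(url) {
  for (const [prefix, root] of Object.entries(schemaRoots)) {
    if (url.startsWith(root) && url.endsWith('.js')) {
      // Strip the root and the ".js" extension added during normalization.
      return prefix + url.slice(root.length, -3);
    }
  }
  // No schema applies: the URL itself is the only name available.
  return url;
}

console.log(urlToSchema('http://www.site.com/packages/custom/portable/schema.js'));
// custom:portable/schema
```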

I've yet to hear a single use case that is lost by enforcing that the registry is only to store URLs - the justifications for allowing the registry to store a custom schema seem to cling to dated models due to history, while there are many benefits as described to both implementors and users in enforcing URLs as the schema and keeping the locate hook deprecated.

@guybedford
Author

One caveat to this - we still allow special modules to be stored in the registry like @empty or @loader. The way we handle this in the latest SystemJS is to not URL-normalize a module if it is a plain name and already in the registry (which can be determined by a simple lookup). This way special environment-independent system modules can still be defined and used. What makes this an exception though is that these modules never get used as a parentAddress to normalize against due to their nature of being set in the registry and not loaded into the registry.
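
Roughly, the exception described above could be sketched like this. The Map registry and the normalize function are stand-ins for illustration, not the actual SystemJS code:

```javascript
// Stand-in registry with one special module set directly, not loaded.
const registry = new Map([['@empty', {}]]);

// Hypothetical normalize: a plain name that is already in the registry
// is left alone; everything else is resolved into URL space.
function normalize(name, parentURL) {
  const isPlain = !/^([a-z][a-z0-9+.-]*:|\/|\.\.?\/)/i.test(name);
  if (isPlain && registry.has(name)) return name;
  return new URL(name, parentURL).href;
}

console.log(normalize('@empty', 'http://site.com/app.js')); // @empty
console.log(normalize('lib', 'http://site.com/app.js'));    // http://site.com/lib
```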

Also to address the plugin argument - http://www.site.com/plugin.js!http://www.site.com/module.js can be fine as a module name - we can handle this by stripping the plugin part in the fetch hook, or when doing relative normalization to this as a parent.
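
A sketch of the stripping step, assuming the plugin!module ordering used in the example key above (the helper names are hypothetical):

```javascript
// Hypothetical helper: split a key of the form "pluginURL!moduleURL".
function splitPluginKey(key) {
  const index = key.indexOf('!');
  if (index === -1) return { plugin: null, address: key };
  return { plugin: key.slice(0, index), address: key.slice(index + 1) };
}

// Relative normalization against a plugin-style parent uses only the
// module part of the parent key.
function normalizeAgainstPluginParent(name, parentKey) {
  return new URL(name, splitPluginKey(parentKey).address).href;
}

const key = 'http://www.site.com/plugin.js!http://www.site.com/module.js';
console.log(splitPluginKey(key).address);                    // http://www.site.com/module.js
console.log(normalizeAgainstPluginParent('./dep.js', key));  // http://www.site.com/dep.js
```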

@matthewp

This is all predicated on something that I don't understand:

In creating the upgrade path in SystemJS for permitting URLs as module identifiers,

Where was it discussed that this was needed or desirable? Is there a link you can provide? I must have missed this discussion. I'll refrain from commenting on it until I read the arguments.


It sounds like a "custom schema" is a module name and schemaToURL is a synchronous normalize that doesn't take parentName or parentAddress.

Look-up becomes much less useful without module names. How can I lookup jquery, for example? Its key might be http://example.com/myapp/node_modules/some_framework/node_modules/jquery/dist/jquery.js.

I'm going to have to always run resolve if I use lookup now it seems. That works fine for jquery or something else that is likely in the sites table, but what if the parentName matters to how it resolves?

For example, in some situations you want to create distinct modules based on who is importing them. With module names I can create a pattern like:

import register from 'register';

and this normalizes to register/childName. So you could have (in the registry):

register/foo
register/bar
register/something/else

Where foo and bar and something/else are modules that are all importing register.

Your schemaToURL won't work here because I have to do 2 asynchronous resolves before I can lookup the module. I have to resolve foo and then "register/" + fooURL.

No doubt the loss of normalize can be worked around but the question is why should it be given there is already a specced solution in place.

The assumption of dropping normalize is that modules always map directly to files, but that is not the case. You cannot do something like:

import data from 'package.json!json';
import npmData from 'package.json!npm';

These are distinct modules that have different purposes, but happen to be locating the same file. This means you have to create a custom schema (a module name format) to handle this. You'll have to do this any time you have modules that don't map directly to files; you have to hook both resolve and fetch.

How are you going to handle fetch, with your plugin example? You're going to have to write something like:

var fetch = loader.fetch;
loader.fetch = function(key){
  // http://www.site.com/plugin.js!http://www.site.com/module.js
  var url = key.split("!")[1];
  return fetch.call(this, url);
};

Except now all fetch hooks will be receiving the key "http://www.site.com/module.js" which is not the module's actual key. Hopefully those hooks aren't expecting to have the real key (to store in localstorage, or call lookup later or something) because they don't have it.

@matthewp

I want to reiterate that last point so that it doesn't get lost:

If there is a need for keys that do not map directly to the url that is fetched it means that any fetch hook cannot trust that the key it is receiving is the module's actual key. This is a big problem. It highlights the need for separate keys and urls.
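
To illustrate the distinction being argued for, here is a sketch that assumes a hypothetical load record carrying both a key and an address (the field names and URLs are made up for the example):

```javascript
// Hypothetical load records: two distinct module keys backed by one file.
const loads = [
  { key: 'package.json!json', address: 'http://example.com/package.json' },
  { key: 'package.json!npm', address: 'http://example.com/package.json' }
];

const seenKeys = [];

// Hypothetical fetch hook that receives the whole record, so the true
// key stays available even though only the address is requested.
function fetchHook(load) {
  seenKeys.push(load.key);
  return load.address;
}

const requested = loads.map(fetchHook);
console.log(seenKeys);  // [ 'package.json!json', 'package.json!npm' ]
console.log(requested); // the same address twice
```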

@guybedford
Author

@matthewp thanks for the quick response. I've got this work running on SystemJS master so feel free to play around with it there and see how it all works out too.

From what I've heard, it sounds like there is consensus that designing a loader for the web that does not support loading URLs is really not something that can be considered. URLs are fundamental to the web. I don't have any links, just heard notes from a meeting (quite some time ago) where it was decided that the automatic adding of a .js extension is untenable as a default system that does not support URLs, and that importing from URLs should be possible.

Using URLs does not inhibit contextual normalization at all - import 'register' is allowed to have different meaning depending on the parent context (parentURL). We just manage that context through the URL of the parent by mapping all configuration itself into URL space. (schemaToURL is basically just a sync normalize, and permits a parent second argument for contextual normalization).
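
A minimal sketch of contextual normalization in URL space, assuming a hypothetical contextualMap whose keys have already been normalized to parent URLs (the URLs and names are illustrative):

```javascript
// Assumed configuration, already mapped into URL space: what 'register'
// means depends on which parent URL is importing it.
const contextualMap = {
  'http://site.com/foo.js': { register: 'http://site.com/register/foo.js' },
  'http://site.com/bar.js': { register: 'http://site.com/register/bar.js' }
};

// Hypothetical sync normalize with a parent second argument.
function normalize(name, parentURL) {
  const scoped = contextualMap[parentURL];
  if (scoped && Object.prototype.hasOwnProperty.call(scoped, name)) {
    return scoped[name];
  }
  return new URL(name, parentURL).href;
}

console.log(normalize('register', 'http://site.com/foo.js')); // http://site.com/register/foo.js
console.log(normalize('register', 'http://site.com/bar.js')); // http://site.com/register/bar.js
```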

I mentioned the plugin pattern in my second comment - plugin systems can still extend URLs to include extra syntax for plugins just fine.

I'm not sure I follow your final argument, which seems to rest on an assumption about the fetch hook that has never been formalized, what use case exactly is affected by this?

@matthewp

matthewp commented Jun 1, 2015

From what I've heard, it sounds like there is consensus that designing a loader for the web that does not support loading URLs is really not something that can be considered. URLs are fundamental to the web. I don't have any links, just heard notes from a meeting (quite some time ago) where it was decided that the automatic adding of a .js extension is untenable as a default system that does not support URLs, and that importing from URLs should be possible.

Ah, this is probably the crux of the problem then. If you think about a module identifier as being a pointer to a file then I can see how this requirement would surface. Hopefully we can have this debate again now that the spec is being worked on in the open. I would suggest a mailing list instead of GitHub issues as it's easy to get drowned out by notifications and miss stuff here. Mailing lists are better for discussion and debate.

I mentioned the plugin pattern in my second comment - plugin systems can still extend URLs to include extra syntax for plugins just fine.

I'm not sure I follow your final argument, which seems to rest on an assumption about the fetch hook that has never been formalized, what use case exactly is affected by this?

Yes, my assumption is that the fetch hook does an xhr request for the key (which is a url). I can't see it being implemented any other way.

If you were to override the fetch hook because you have a custom url schema (such as with a plugin system) you will be calling super() with a url that is not the module key. This breaks an important assumption when overriding a hook; that you have access to the module's true key.

@MajorBreakfast

  • The URL approach makes sense for the web.
  • I've a feeling that allowing async AND sync normalization (through schemaToURL()) might be problematic as there are use cases that can only be achieved asynchronously (e.g. worker talking to main thread, network request, etc.).
  • System.register('custom:portable/schema', ...) (from above example) has an invalid URL. It should read custom://.... Edit: It's valid.
  • Can you provide some build tool URL examples? I'm guessing you've already thought about this a lot because it's relevant for system-builder. I'm curious what things need to be considered to make the bundles portable.
  • I personally prefer github over mailing lists (a lot)

@jrburke

jrburke commented Jun 2, 2015

I can see a loader design that perhaps does not specify a locate hook, with fetch() taking care of that internally, and the loader just having a normalize or resolve. Although something like a .locate() on a module meta for a module has been useful in AMD systems, and I expect that utility to carry forward. But leave that on the side for now.

I still think it makes sense for the ID keys for modules internal to the loader to be module IDs instead of URLs though.

Modules are units of functionality, like functions. The module loader is like a scope object for those code units, and the identifiers for those code units should just be something like an identifier, not a URL.

For module bundling cases and some plugin loader resource IDs that are not URLs, the provider may not be at an actual URL. Best to keep the IDs separate from URLs to enforce that conceptual distinction.


As to the dot normalization for CJS/AMD IDs, this exists because objects should not name themselves in source form. So sub-pieces inside a package need to be able to reference other parts in that package. Instead of seeing the module ID parts as a file path/URL, see it as referencing different parts of an object with a nested property/object structure.

There is a package that provides button. There can be a button/dropdown inside of there. If button/dropdown needs to refer to functionality in button, the '../button' allows dropdown to refer to that functionality without knowing the global name.

Using '/' instead of '.' for the parts in a whole package object definition makes it easy to use '../' or './' without knowing the global object name or needing to invent a new syntax that does not conflict with '.', if '.' was instead used for separating object parts (so button.dropdown vs button/dropdown).

It also turns out to be helpful for a simple ID-to-path conversion, if a file fetch is needed. That just helps reduce the number of indirection layers, so creates a simpler overall system, but not the primary goal of the IDs.


If the argument is that a loader that cannot use URLs as IDs is somehow not fundamentally of the web, I think this assumes too much about how the JS module system relates to the old world of script src tags. As illustrated above, module IDs are a replacement for object.sub.value references that allow async tracing and resolution before triggering execution of the module body.

Perhaps the "URLs must be supported" comes from thinking that script tags should be usable in an ES module system, but I think that is not the best way to look at script src tags in a module system. Some notes on that:

Script src tags are separate primitives for loading script and evaluating it. Consider it an XHR call with an eval() in global script space, and extra semantics for blocking rendering. Dependencies are expressed via a linear global list of script src tags.

In contrast, modules are about units of code that specify their dependencies not with an implicit global identifier but with a local identifier tied to a string reference, where that string reference allows some level of indirection and resolution.

Sometimes that indirection means translating that ID to a URL to load, sometimes it means just grabbing that module definition by the normalized string from a registry populated by bundled modules. Modules are a tree structure that (to work in the browser) are asynchronously discovered.

That async discovery of layers in a tree model fundamentally does not work well with the linear, and sometimes render-blocking, model of script src tags. There is a set of problems when using an AMD loader alongside manually coded script src tags that are just avoided by not using the script src tags at all. The conceptual models are just different, and there will be less confusion by keeping them separate.

@MajorBreakfast

@jrburke Your last post is a bit verbose. Just say everything once, please.

Part 1

I can see a loader design that perhaps does not specify a locate hook, with fetch() taking care of that internally

Fetch does nothing more than loading.

code units should just be something like an identifier, not a URL

It is an identifier. (What else would it be?) Also, one of the most frequently used ones.

the provider may not be at an actual URL

It might not be a web server. But URLs work for all kinds of sources (e.g. android content providers). Just use a custom schema.

Part 2
That is supported and yes relative module paths are useful.

Part 3
SystemJS can load modules that export a global; so @guybedford is definitely aware of these points. As you say script tags are something different to modules. But so are URLs and script tags. URLs are an essential building block of the web. So, I say that this won't cause confusion because we already use them for all kinds of things. However, "Having two ways to refer to the same module is a bug waiting to happen" (from @guybedford's post above).

@jrburke

jrburke commented Jun 3, 2015

Part 1
It is an identifier. (What else would it be?) Also, one of the most frequently used ones.

I meant more like the JS language concept of Identifier. Sorry for not making that clear.

It might not be a web server. But URLs work for all kinds of sources (e.g. android content providers). Just use a custom schema.

URLs are great for global systems, like addressing content across an OS, that need disambiguation, android content providers being one of those. However, module use is more about local context, and it is more useful to have names that represent contracts than actual URLs. My 'jquery' could actually be provided by 'zepto'.

It is good if a dependency indicates via a globally resolvable identifier (like a URL) in the package.json where to fetch a local identifier dependency if the local project does not have an opinion, but also allow the local project to override that with a preferred local value.

There is a very strong analogy with custom elements in HTML, where they are tag names, but the resolution of that name to provider is separate from that name, and each tag may not be delivered by individual URLs.

Part 2
That is supported and yes relative module paths are useful.

This was in response to @guybedford's original post section on Dot Normalization. The duplication is not really about a path duplication, but about being able to reference different concept parts in a localized way, without knowing the global ID, and if it looks like duplication, it is more about flattening the number of transforms from ID to path (when a path is required for fetching) than being specifically a URL concept.

Part 3
URLs are an essential building block of the web.

They are, because they help disambiguate across a global space, across all web sites. It does not follow though that it is therefore what a module loader needs to use as internal keys. And given the use cases of module bundling, loader plugins IDs that are not URLs, and even other examples in the web space, like custom elements, it is much more flexible to keep the URL concept solely related to when fetching is needed, when global disambiguation is needed (like, the browser cache is global across all web sites, so a URL is needed in that case).

@johnjbarton

it is much more flexible to keep the URL concept solely related to when fetching is needed,

Is every module registered the result of fetch()? So that every entry in the registry corresponds to a URL used in fetch()? Not using this equivalence would seem to make implementation more confusing.

Doesn't the issue of "module ids" vs URLs relate to module specifiers rather than the registry keys?

@matthewp

matthewp commented Jun 3, 2015

Is every module registered the result of fetch()? So that every entry in the registry corresponds to a URL used in fetch()? Not using this equivalence would seem to make implementation more confusing.

No, the loader.install (previously loader.set) API allows you to create entries arbitrarily from any object in memory. This is where URLs as keys really breaks down, modules are not all files.

@MajorBreakfast

My 'jquery' could actually be provided by 'zepto'

@jrburke In this case jquery normalizes to the URL of zepto. That URL can then be used as key.

@johnjbarton

This is where URLs as keys really breaks down, modules are not all files.

Well URLs do not identify files so why does "modules are not all files" relate?

@matthewp

matthewp commented Jun 3, 2015

Well URLs do not identify files so why does "modules are not all files" relate?

They do identify a resource that can be fetched, though, I thought this is why you asked the question. If all modules relate to a URL it might make sense (in some aspects) that they be the key, but since they do not all relate to a URL it makes much less sense.

@johnjbarton

If the developer who calls loader.install() wants to avoid calling loader.fetch() for any modules they have already loaded, then the key they use for install() has to match the keys used in fetch() for any module with identical content. This is the point @guybedford made in his introduction

trying to make a custom schema work alongside URLs in the same registry space, ends up causing more issues, for no practical gain.

This does not mean the keys must be URLs, it just means that the keys should represent the modules not the specifiers or a mapping mechanism.
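
A toy sketch of why the install() key has to match the key used when loading. The install and load functions and the fetch counter here are stand-ins, not the real loader API:

```javascript
// Stand-in registry and a counter for how often a fetch would happen.
const registry = new Map();
let fetchCount = 0;

// Stand-in for loader.install: store a module under the given key.
function install(key, module) {
  registry.set(key, module);
}

// Stand-in for loading: only "fetch" when the key is not yet registered.
function load(key) {
  if (!registry.has(key)) {
    fetchCount += 1;
    registry.set(key, { fetchedFrom: key });
  }
  return registry.get(key);
}

// Installed under the same key that loading uses: no fetch needed.
install('http://site.com/a.js', { installed: true });
load('http://site.com/a.js');
const fetchesAfterMatchingKey = fetchCount;

// Installed under a different name for the same module: fetched again.
install('b', { installed: true });
load('http://site.com/b.js');
const fetchesAfterMismatchedKey = fetchCount;

console.log(fetchesAfterMatchingKey, fetchesAfterMismatchedKey); // 0 1
```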

As @jrburke points out the arguments about URLs being fundamental etc don't really make sense. But URLs are approximately unique and approximately represent content. We can't actually do better than approximate here because the content of module definitions can change at arbitrary times on servers far away. We have a lot of experience with how URLs fall short. We have tools to deal with them.

I guess the handful of folks who deal with this level of the API will be able to deal with URLs by just treating them as strings even if they have some drawbacks (verbose, obtuse, vague).

(and I'm not even sure how this relates to locate() since it seems to me this function was an arbitrary split in the old pipeline).

@jrburke

jrburke commented Jun 3, 2015

As @jrburke points out the arguments about URLs being fundamental etc don't really make sense. But URLs are approximately unique and approximately represent content. We can't actually do better than approximate here because the content of module definitions can change at arbitrary times on servers far away. We have a lot of experience with how URLs fall short. We have tools to deal with them.

To reiterate my main point, then to hopefully stop posting in this ticket:

URLs are best for when things really need to be uniquely disambiguated in a global space. Browser cache across multiple contexts (pages) is one of those. Package managers installing code from a global space into a local context is another. In the package manager case, it is specified outside the module loader system, in package metadata, and only comes into play on initial install to the local project context.

For module keys, I do not believe URLs are the right model. Better ones are the function Identifiers used in JS, or custom element IDs, both conceptual names that need to be unique within a scope. They do not need URLs to work -- the fetching (if needed) of the code that backs those IDs is a separate concern.

Strings are used to refer to modules instead of JS Identifiers because the IDs need some expressiveness outside regular JS Identifier punctuation, and it helps indicate that these are special in their resolution and do not require that the value at that identifier is completely defined when they first are encountered.

Similarity to a string path or URL in some cases is just for flattening some of the translation logic if fetching is needed, but fetching is conceptually a separate concern from modules' unique IDs in a loader instance.

@guybedford
Author

@jrburke no one is arguing that portable names are not desirable. Of course they are. The argument is simply that storing portable names in the registry itself is not a good idea.

To summarize my initial argument:

  1. Having a locate hook allows portable names to be stored in the registry itself.
  2. We do need to support URLs loaded as modules and stored in the registry.
  3. Managing portable names and URLs together causes a lot of issues as we need to handle all the edge normalization cases - URL normalized against a module name, configuration of URLs together with configuration of portable names etc etc
  4. I don't think the spec should encourage implementors to have to deal with all these issues as there are so many difficulties.
  5. Rather encourage implementors to use URL-space as the primary space mapping all names there.
  6. No one is stopping implementors from using portable names within implementations, but they are simply not the unit of abstraction of the loader itself.
  7. No one is stopping other names from being saved into the registry either - the key point here being that these modules have no dependencies so never have to deal with how to be normalized against or configured in naming spaces.

To summarize my proposal: use portable names, but normalize them into URLs for the environment. This is completely working from bundling to loading in SystemJS and jspm now, being released as of tomorrow.

@jrburke

jrburke commented Jun 3, 2015

  2. We do need to support URLs loaded as modules and stored in the registry.

I believe this is the part causing the trouble. I did that too in requirejs, and it led to bad expectations, particularly around build bundling and trying to reference IDs outside the top of the namespace, then expecting that to be addressable as a module later for bundling.

I assumed I needed to do it to allow people to transition from script src tags, but it was the wrong choice, script src tags are really a different thing than modules, and where I can make new AMD loaders that break compatibility with requirejs, I will just enforce using portable names.

So I suggest whatever reasons might be driving wanting to load URLs directly as a module for your case should be revisited. It would be great if those reasons were illustrated somewhere, maybe I can learn from them, but based on previous thread comments it seems like it is not readily available.

@guybedford
Author

I believe this is the part causing the trouble. I did that too in requirejs, and it led to bad expectations, particularly around build bundling and trying to reference IDs outside the top of the namespace, then expecting that to be addressable as a module later for bundling.

@jrburke exactly, it is these difficulties trying to make URLs and names work together that leads to my argument that we store URLs in the registry.

This entirely comes from wanting URLs to be permissible. I believe this is a pretty set requirement and I think the point is that we can't "ignore" URLs when writing for the web, but @dherman or @caridy may have better arguments for the decision here.

Note that it is very simple given a URL to work out its portable name again for bundling for example.

@matthewp

matthewp commented Jun 4, 2015

@guybedford you said this has been implemented in SystemJS, where? master doesn't appear this way.

@matthewp

matthewp commented Jun 4, 2015

One caveat to this - we still allow special modules to be stored in the registry like @empty or @loader. The way we handle this in the latest SystemJS is to not URL-normalize a module if it is a plain name and already in the registry (which can be determined by a simple lookup). This way special environment-independent system modules can still be defined and used. What makes this an exception though is that these modules never get used as a parentAddress to normalize against due to their nature of being set in the registry and not loaded into the registry.

This restricts the usage of .install drastically. Essentially you can only use it before 1) anything has been imported or 2) If you can trap it from being imported in resolve. This is related to #48. I have deployed code that expects to be able to call .install after normalization is complete, I can't use this code now, it seems.

@matthewp

matthewp commented Jun 4, 2015

Ok @guybedford, so I understand how your new code works now. A couple of problems I spot:

  1. Plugins don't go through normalization, only a subset of normalization that you know to be sync: https://github.com/systemjs/systemjs/blob/b299e4f46dbf5dcaccdb26c7ea37ac8fd10f1c41/lib/plugins.js#L53

You can't use plugins that use an extension, for example system-npm, you couldn't do foo!bar where bar is an npm package now.

  2. Fetching plugins only works today because normalize and locate are separate: https://github.com/systemjs/systemjs/blob/b299e4f46dbf5dcaccdb26c7ea37ac8fd10f1c41/lib/plugins.js#L112

This won't work any more once there's only a key. You'll have to split on ! and call the parent normalize with the plugin argument. This breaks the contract of a loader hook and can have bad side-effects (if there's another fetch hook expecting real keys).
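To illustrate the splitting step mentioned above, here is a minimal sketch of what a hook would have to do with a combined plugin key. The function name `splitPluginKey` and the example URLs are hypothetical, and the sketch assumes the `resource!plugin` ordering; it is not taken from the SystemJS source.

```javascript
// Hypothetical helper: split a combined "resource!plugin" key into its parts.
// Assumes the plugin name follows the last "!" in the key.
function splitPluginKey(key) {
  const index = key.lastIndexOf('!');
  if (index === -1) {
    // No plugin syntax: the key is an ordinary module key.
    return { resource: key, plugin: null };
  }
  return {
    resource: key.slice(0, index),
    plugin: key.slice(index + 1)
  };
}

// Example with made-up URLs: neither the combined key nor a hook
// receiving it can treat the whole string as a single fetchable URL.
const parts = splitPluginKey(
  'http://www.site.com/app/template.html!http://www.site.com/plugins/text.js'
);
console.log(parts.resource, parts.plugin);
```

The combined string is not itself a URL, which is the concern raised here: any hook that expects real keys would need to know about this convention.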

@matthewp

matthewp commented Jun 4, 2015

@guybedford You didn't explain how you plan to make URLs portable. Can you explain this? Looking at your build code it looks like you attempt to do a reverse normalization: https://github.com/systemjs/builder/blob/67754eee48f3604233fd9ad005240801f693d9c2/lib/builder.js#L120

This assumes knowledge of how the original normalization took place and won't work otherwise. Correct me if I'm wrong, but I think the only way to make a truly portable schema would be to have something like normalize take place.

There are still unanswered questions about making the Loader spec fast enough that http2 is viable, so I don't think we can simply ignore the bundling need. I'd love to hear how portability can be achieved without normalize though.

@guybedford
Author

@matthewp to quote the original post:

We still keep our schema if we like - which we can bundle into just the same:

System.register('custom:portable/schema', ...);

Where the name above is normalized into a resolved name of http://www.site.com/packages/custom/portable/schema.js by the loader when being processed and stored in the registry (bundle names are now treated as unnormalized).

That is, yes exactly, we reverse from URLs into our portable schema at build time, and have bundle names normalized by the loader.

The alternative is to effectively do this in the loader itself, but the issue, as I mentioned, is that we then convert the schema to a URL and back to the schema in normalize just to handle the URL normalization problems, which doubles up work unnecessarily. Hence the argument for URLs as the environment reference, which can easily be converted back and forth, the point being that this happens only as needed.
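The build-time reversal described above can be sketched roughly as follows. This is a minimal illustration, assuming a single wildcard rule that matches the example URL; the `paths` table and the helper names `toURL` and `toPortableName` are hypothetical and are not the actual SystemJS or builder code. It only works because both directions share the same rule table.

```javascript
// Hypothetical wildcard rule table (every pattern must contain one "*").
const paths = {
  'custom:*': 'http://www.site.com/packages/custom/*.js'
};

// Split a wildcard pattern into the text before and after the "*".
function splitPattern(pattern) {
  const i = pattern.indexOf('*');
  return [pattern.slice(0, i), pattern.slice(i + 1)];
}

// Rewrite input from one pattern to another, or return null if it does not match.
function applyRule(input, fromPattern, toPattern) {
  const [fromPrefix, fromSuffix] = splitPattern(fromPattern);
  if (input.length < fromPrefix.length + fromSuffix.length ||
      !input.startsWith(fromPrefix) || !input.endsWith(fromSuffix)) {
    return null;
  }
  const wildcard = input.slice(fromPrefix.length, input.length - fromSuffix.length);
  const [toPrefix, toSuffix] = splitPattern(toPattern);
  return toPrefix + wildcard + toSuffix;
}

// Loader direction: portable schema name -> URL.
function toURL(name) {
  for (const [schema, url] of Object.entries(paths)) {
    const result = applyRule(name, schema, url);
    if (result !== null) return result;
  }
  return name; // no rule matched: leave the name untouched
}

// Build direction: URL -> portable schema name, using the same rules reversed.
function toPortableName(url) {
  for (const [schema, urlPattern] of Object.entries(paths)) {
    const result = applyRule(url, urlPattern, schema);
    if (result !== null) return result;
  }
  return url; // no rule matched: the URL itself is the name
}

console.log(toURL('custom:portable/schema'));
console.log(toPortableName('http://www.site.com/packages/custom/portable/schema.js'));
```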

@matthewp

matthewp commented Jun 4, 2015

That is, yes exactly, we reverse from URLs into our portable schema at build time, and have bundle names normalized by the loader.

You cannot reverse it though. There's no algorithm for doing so. It works in your case because you are making assumptions about how normalization originally occurred. This is not sufficient for generating loader-agnostic portable names. Normalize is.

@guybedford
Author

Yes, any type of schema name is specific to the implementation in question, and is the responsibility of the system imposing the schema. The rule in jspm is currently simply the SystemJS wildcard paths configuration.

The argument is not about what is necessary for portable names (and with a working implementation I've shown URLs in the registry work completely fine), it's about what is necessary to permit URLs and schemas together in a module loader.

@matthewp

matthewp commented Jun 4, 2015

The argument is about whether a locate hook is necessary. If the solution for creating portable names requires something like normalize then I think that's a strong argument that we should just keep normalize. Your working implementation makes unreasonable assumptions that won't scale.

@guybedford
Author

Just to clarify again here from the last two comments for the record, the argument is simply:

How to permit URLs and custom schemas together in the same module naming system?
=>
Make URL-space the target normalization space
=>
Locate hook is unnecessary as normalize already returns URLs

Of course the locate hook can be allowed and implemented, in turn allowing reverse-normalizing from URL-space back into schema-space at the end of normalize for custom schemas to be used in the registry. But we'd need a good reason to do this as it is unnecessary work otherwise. So in SystemJS we're moving along the path of not having locate to test this out properly. If we hit an issue that shows this to be a terrible idea we can reconsider, but that is yet to happen so far.
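As a rough sketch of the chain above, a normalize that targets URL-space might look like the following. The schema prefix, base URLs, and the function body are assumptions made purely for illustration (they mirror the example URL from earlier in the thread) and are not the SystemJS implementation. The only platform API used is the standard `URL` constructor.

```javascript
// Hypothetical schema rule and base URL, chosen only for illustration.
const schemaPrefix = 'custom:';
const schemaBase = 'http://www.site.com/packages/custom/';
const baseURL = 'http://www.site.com/app/';

function normalize(name, parentURL) {
  // Custom schema names map straight into URL space. This check has to come
  // first, because "custom:..." would otherwise parse as a URL with a
  // "custom" scheme.
  if (name.startsWith(schemaPrefix)) {
    return schemaBase + name.slice(schemaPrefix.length) + '.js';
  }
  // Everything else is resolved as a URL against the parent (or the base).
  return new URL(name, parentURL || baseURL).href;
}

// All three kinds of input end up as URLs, so there is nothing left for a
// separate locate step to do.
console.log(normalize('custom:portable/schema'));
console.log(normalize('./dep.js', 'http://www.site.com/app/main.js'));
console.log(normalize('http://cdn.example.com/lib.js'));
```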

@matthewp

If we hit an issue that shows this to be a terrible idea we can reconsider, but that is yet to happen so far.

Well so far you had to gimp your plugin system and kill your extension ecosystem. Maybe you don't think that counts as a "terrible idea" but I certainly do.

Your builder now won't work if the user is using custom extensions (maybe it never did, but ours did).

It is very easy to make a build tool that works with any WhatWG Loader if you have normalize. Without it, it's not, since the whole point of this change is to remove the ability to identify modules outside of what loader they are running in.

By the way, SystemJS is still using the locate hook such as here. When you finally do get rid of locate you're going to run into the problem described here with plugins since you're going to have to call fetch with a non-key in that case (or any case where you want to fetch a module whose key is not a real url).

@guybedford
Author

@matthewp what is a use case for fetching a module whose key is not a real url? For things like core modules, we can permit non-URL names from normalize via the first line of normalize being:

function resolve(key, parent) {
  if (loader.has(key))
    return key;
}

That enables things like core modules like @math/... etc.
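A slightly fuller sketch of that idea, with the fall-through to URL resolution included, could look like this. The `registry` Map is a stand-in for the loader registry and the URL fallback is an assumption about the rest of the hook; neither is taken from the spec or from SystemJS.

```javascript
// Stand-in for the loader registry, pre-populated with one special module.
const registry = new Map([['@empty', {}]]);

function resolve(key, parentURL) {
  // Names already set in the registry (e.g. core modules) are returned as-is.
  if (registry.has(key)) {
    return key;
  }
  // Assumed fallback: everything else is resolved into URL space.
  return new URL(key, parentURL).href;
}

console.log(resolve('@empty', 'http://www.site.com/app/main.js'));
console.log(resolve('./a.js', 'http://www.site.com/app/main.js'));
```

Note that the registry check is a plain lookup, so a special name only bypasses URL resolution once it has actually been set in the registry.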

@matthewp

what is a use case for fetching a module whose key is not a real url?

plugin resources.

robwormald added a commit to robwormald/angular that referenced this issue Oct 6, 2015
upgrade systemjs-builder to 0.14.x

BREAKING CHANGE:

URLs are now first-class module names. All names are normalized into URLs into the registry as part of the normalization process. See https://github.com/systemjs/systemjs/releases/tag/0.17.0 and whatwg/loader#52 for more details.
@caridy
Contributor

caridy commented Dec 1, 2015

@guybedford can we just close this now that we have merged PR #97?

@guybedford
Author

@caridy sure, and thanks for your excellent work on this.
