Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify loading metadata #37

Open
matthewp opened this issue Mar 12, 2015 · 26 comments
Open

Specify loading metadata #37

matthewp opened this issue Mar 12, 2015 · 26 comments

Comments

@matthewp
Copy link

In the old spec this was useful as a way to share information about a module between loader hooks (SystemJS for example had a load.metadata object). You can maintain your own hashtable for this but this is inherently less shareable. It was also nice just for the simplicity of every hook having the same signature (except normalize).

Was this intentionally removed for a particular reason or was it just not thought to be needed?

@guybedford
Copy link

@matthewp as far as I'm aware, a private metadata object will still be part of the spec as a final argument to the hooks, it just hasn't been written in yet.

@matthewp
Copy link
Author

That's good, I still think load is a nicer signature than N arguments depending on the hook.

@dherman dherman changed the title Consider bringing back the load object Specify loading metadata Mar 30, 2015
@dherman
Copy link

dherman commented Mar 30, 2015

@guybedford In #24 you suggested that it'd be nice for the resolve hook to get passed the metadata, and for the metadata to be passed in to the resolve method but I'm not sure I understand how client code could actually make use of this, given that only one metadata object can win, and that can only be determined after resolution completes. What are the uses of a preallocated-but-possibly-discarded metadata object being passed in to resolve?

@guybedford
Copy link

@dherman consider that the majority of understanding about a module occurs in resolve - as part of performing normalizations lots of metadata / state gets put together in the process, that is basically what we want to be available to the subsequent hooks. The metadata is discarded and useless if it wasn't the first to resolve to the given name, but the metadata that remains stops us from having to maintain a custom metadata cache or reprocess on fetch.

@matthewp
Copy link
Author

Agree with @guybedford here, metadata is important for normalization. Plugin detection happens at this step, for example.

@dherman
Copy link

dherman commented Apr 2, 2015

@matthewp I talked with @guybedford on IM about this. I think I understand what you want: the ability to take information you gleaned from the name (and may have stripped out of the normalized name) and save it for later in the loading process. But passing in an object that may be the wrong object is not a good mechanism for achieving this. I will try to come up with something that avoids unnecessarily allocating objects and handing misleading objects to the resolve hook.

@matthewp
Copy link
Author

matthewp commented Apr 2, 2015

That might be a sign that resolve is doing too much. Maybe we should split it up into 2 separate hooks.... I think I have good names for them already. :-)

@dherman
Copy link

dherman commented Apr 2, 2015

If you don't need the unresolved name, then why aren't you just doing this work in the locate hook?

@matthewp
Copy link
Author

matthewp commented Apr 2, 2015

There is no locate hook. http://whatwg.github.io/loader/#reflect-load-hook

@dherman
Copy link

dherman commented Apr 2, 2015

LOL -- I meant the fetch hook :)

@dherman
Copy link

dherman commented Apr 2, 2015

IOW, there are two different possible issues here. One is that you might just want to conceptually tease out two different steps, with no real increase in power -- and that would suggest bringing back the locate hook.

But the other is that you're actually wanting to save metadata based on information that you erase from the non-normalized name -- and for that, a locate hook wouldn't help you, because it only receives the normalized name. This use case is a bit fishy, since different non-normalized names might resolve to the same normalized name but have conflicting metadata from the erased parts, but I can see it happening in practice

For example, you could have a plugin mechanism where "es6!foo" and "es7!foo" both resolve to "/foo.js" but they are requesting conflicting plugins. Now, the plugin system could choose to enforce consistency by treating that as an error. (IMO this is a flaw in the AMD plugin system -- you should configure that stuff in one place, rather than having to repeat yourself everywhere and try not to have conflicts.) So while this isn't to my taste, I can imagine it happening.

Is that the kind of power you're wanting? Am I wrong that the locate hook wouldn't actually help you?

@matthewp
Copy link
Author

matthewp commented Apr 2, 2015

I think maybe you're thinking about functions within a hook as competing with each other when in reality they form a chain where each does it's own little small part and then calls the parent, and so on. Yes, at any point within the chain a function could exit out. But that is the exception, in my experience.

@dherman
Copy link

dherman commented Apr 2, 2015

I think you're missing something important. It's only a linear chain after resolve. The resolve hook is a "fan-in" to a chain.

"./foo.js"  "foo"  "../../../foo.js"
     \        |        /
      \       |       /
   resolve resolve resolve
        \     |     /
         \    |    /
          \   |   /
           \  |  /
            \ | /
             \|/
              |
            fetch
              |
          translate
              |
         instantiate
              |

@matthewp
Copy link
Author

matthewp commented Apr 2, 2015

I'm talking within the same hook. Consider:

import foo from "es7!./foo";
  1. The addJs extension adds a .js to the name.
  2. The plugin extension will split on !, load the es7 module (which itself will go through resolve and become /es7.js).
  3. The plugin will receive ./foo.js and do its own thing. Maybe all es7 implementations are under a /es7 folder so it resolves to /es7/foo.js.

These 3 steps are not competing with each other. There is no "winner".

I agree that if there were a locate hook there would be no need for metadata within normalize. The plugin extension would add its metadata in the front of the locate step.

@dherman
Copy link

dherman commented Apr 2, 2015

I think we're talking past each other...

I'm talking within the same hook. ... These 3 steps are not competing with each other. There is no "winner".

Hm, but that's not what I was talking about. The "winner" I'm referring to is where you have three different calls to resolve, one for "es7!foo", one for "es7!../foo.js", and one for "es7!../../../foo.js", each of which resolves to the same thing: "/es7/foo.js". For each call to resolve, a distinct metadata object is passed in. But only one of those metadata objects actually ends up being the metadata object that gets propagated to the rest of the chain. The others are setting properties on an object that disappears into the ether.

I agree that if there were a locate hook there would be no need for metadata within normalize. The plugin extension would add its metadata in the front of the locate step.

That's the opposite of what I was saying: the locate step does not add any interesting power here. If you would do this work in the locate step, why not just do it in the fetch step?

@matthewp
Copy link
Author

matthewp commented Apr 2, 2015

But only one of those metadata objects actually ends up being the metadata object that gets propagated to the rest of the chain. The others are setting properties on an object that disappears into the ether.

Yes, we agree here, this is not desirable, which is why normalize never had a load object I imagine.

That's the opposite of what I was saying: the locate step does not add any interesting power here. If you would do this work in the locate step, why not just do it in the fetch step?

Because by the time we've gotten to fetch we've lost the interesting bits about the module :) It's no longer a name with contextual meaning. It's too late to try and add these back to a newly provided metadata object unless you maintain your own separate hash table inside resolve.

So yes, I understand where our thinking diverged now, and I'll answer your appropriate question:

But the other is that you're actually wanting to save metadata based on information that you erase from the non-normalized name -- and for that, a locate hook wouldn't help you, because it only receives the normalized name.

The parts that are erased aren't the parts you want to save metadata about. In the case of plugins that will still be part of the normalized name. Typically normalize functions will add to the name but not remove parts (relative paths are the only exception I can think of).

@dherman
Copy link

dherman commented Apr 3, 2015

Sorry I'm still confused... :( On the one hand you're saying there's no loss of information when normalizing, but on the other hand you're saying that you've lost information required for contextualization. Also, where you say "we agree here" it sounds like you're agreeing you don't want the metadata object passed in to resolve, and yet I thought you're arguing it should be?

Maybe if you could give me a more concrete example of what specific kinds of data you want to store I'll understand better?

@matthewp
Copy link
Author

matthewp commented Apr 3, 2015

So, under the old spec there were 2 hooks, normalize and locate. Normalize produced a fully normalized name. The new spec only has resolve which produces a url. I'm saying that under the old spec there was no loss of information because of module names. We don't have module names now.

With normalize/locate you could do something like:

import jquery from "jquery";

This normalizes like:

jquery -> jquery@2.2.2#dist/jquery

The fully normalized name is jquery@2.2.2#dist/jquery

Notice there is no erasing of information as you were worried about before. The locate step will do:

jquery@2.2.2#dist/jquery -> http://localhost/my/app/node_modules/jquery/dist/jquery.js

With resolve it will just be:

jquery -> http://localhost/my/app/node_modules/jquery/dist/jquery.js

To keep each piece of the information that happened after normalization I need to maintain my own hash table because it is gone by the time we get to fetch.

Now, you might say that resolve can include this information

http://localhost/my/app/node_modules/jquery/dist/jquery.js&jquery@2.2.2#jquery/dist

And then fetch can split this information off... but I don't think this is a good solution to the problem.

@guybedford
Copy link

Note there are also use cases where it can be useful to provide metadata directly into the import function as well:

System.import('file.js', { moduleFormat: 'CommonJS' });

The alternative to the above is declarative configuration entirely, which also works, but without a canonical form to allow declarative configuration this becomes harder.

@caridy
Copy link
Contributor

caridy commented Dec 24, 2015

Update

[[Metadata]] is an internal slot on ModuleStatus objects (registry entries), but as today, there is no way to set them, and no way to inspect them, we could easily add that if we figure this is important.

The two camps are:

1 - in user-land, you can have a map to store meta information about the modules you own.
2 - if the metadata can be used cross hooks (when doing AOP), it is hard to pass it around. Also inspection, and other goodies that will analyze the dep graph can access some metadata as well.

If we decide to support metadata definition in user-land, we will have to define how set/get it.

@Constellation
Copy link
Member

Constellation commented Aug 8, 2016

Now, the metadata support becomes mandatory to support whatwg's script type="module" tag.

Quoted from #148

Looking through whatwg module spec and I think that the module-script assumes per-module-script-tree information. For example, nonce is propagated through the module script tree fetching.

However, in the current loader spec, it seems that BrowserLoader's @@fetch hook just takes the entry and the key. So the above information seems discarded through the pipeline.

And the metadata must be propagated throughout the pipeline. So (2) is necessary since the metadata is used cross hooks.

@Constellation
Copy link
Member

Constellation commented Aug 8, 2016

The easiest way is, just passing the metadata throughout the pipeline along with entry.

And the metadata can be used over multiple entrys. For example, the fetch request from the one entry should follow the nonce etc. parameters. The metadata needs to be shared (or same data needs to be propagated) in all the modules fetched in the module-script-tree.

@caridy
Copy link
Contributor

caridy commented Aug 8, 2016

this is going to be very tricky because we don't know what to share.

@Constellation
Copy link
Member

Constellation commented Aug 9, 2016

edit: Fixed. Thanks @domenic !

For whatwg <script type="module">, it seems that,

  • a settings object (It should be aded to fetch request)
  • a destination (Currently, this is always "script") (One of "script", "worker", and "sharedworker")
  • a cryptographic nonce (It should be added to fetch request)
  • a parser state ( (It should be aded to fetch request)
  • a credentials mode (It should be aded to fetch request)
  • a module map settings object
  • an ancestor list
  • a referrer (Currently, this is always "client") ("client" for top level, referring module script's base URL for non-top modules)
  • a top-level module fetch flag

need to be shared in the module-script-tree.
https://html.spec.whatwg.org/multipage/webappapis.html#fetch-a-module-script-tree

@domenic
Copy link
Member

domenic commented Aug 9, 2016

a destination (Currently, this is always "script")

Sometimes it is "worker" or "sharedworker"

a referrer (Currently, this is always "client")

For non-top-level fetches it is the referring module script's base URL


In general I would caution against paying too much attention to the model in whatwg/loader. It is still speculative and per #147 may undergo significant changes. The model in HTML is the one that is ready for implementations.

@Constellation
Copy link
Member

Sometimes it is "worker" or "sharedworker"

For non-top-level fetches it is the referring module script's base URL

Oops! Super nice, thanks!

In general I would caution against paying too much attention to the model in whatwg/loader. It is still speculative and per #147 may undergo significant changes. The model in HTML is the one that is ready for implementations.

Yeah, so we need to design the simplified loader spec that cares about this issue (per-module-script-tree-scope-metadata).
Without this, the new simplified loader spec cannot enjoy the script type="module" ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

6 participants