Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: split embark into embark and @embark/core #1048

Open
michaelsbradleyjr opened this issue Nov 7, 2018 · 8 comments
Open

Proposal: split embark into embark and @embark/core #1048

michaelsbradleyjr opened this issue Nov 7, 2018 · 8 comments

Comments

@michaelsbradleyjr
Copy link
Contributor

michaelsbradleyjr commented Nov 7, 2018

Overview

TL;DR

Module resolution with DApps and Embark is presently complex. See #1039, for example. If the bulk of Embark were contained in a locally installed @embark/core package, and if we make use of peer dependencies, the situation can be greatly improved. Later on, we might split up @embark/core into a collection of locally installed @embark/[name] packages, and that will work fine too.

Extra Detail

Key ideas

  • embark becomes a very lightweight package that can be installed globally or locally, and which provides a minimal cli.
  • @embark/core provides most of what embark provides now, minus the pieces in the revised embark.

minimal embark cli

It would have:

  • Template generator
  • Logic similar to the existing shim that looks for @embark/core, hooking into it where it finds it, or prompting the user to install it if wasn't found. It could even offer to install it for the user and update the DApp's package.json.

When run outside a DApp directory, the help output might look like this:

$ embark
Usage:  [options] [command]

Options:

  -V, --version                              output the version number
  -h, --help                                 output usage information

Commands:

  new [options] [name]                       New Application
  demo [options]                             create a working dapp with a SimpleStorage contract
  version                                    output the version number
  help                                       output usage information and help information

When run inside a DApp directory, it will bolt onto @embark/core's cmd.js and the help output would show the additional cli facilities that @embark/core provides (everything that embark help displays at the present time). Also, version output inside a DApp would include not only the version of embark but also of @embark/core.

@embark/core

It would consist of most of what's in embark presently, minus the parts that are in the revised package, such as bin/embark, the template generator, etc. (see the previous section).

It would be intended to always and only be installed locally. If it's installed globally, it won't cause any problems, but it would be inert and not usable by DApps. It's the same situation if you installed the bluebird package globally (npm i -g bluebird) and then want to use it from a Node project you're developing: it doesn't work that way; you need to npm i bluebird inside your project.

An important change is that any Embark dependencies that may be resolved by a DApp script or DApp dependency will need to be either:

  • Converted to a peer dependency of @embark/core and installed alongside it. The revised embark cli-package can help with this — after finding/installing @embark/core it can check whether any peer dependencies specified by @embark/core are not yet installed or if the already-installed versions aren't acceptable. @babel/core is an example of a package that would become an @embark/core peer dependency.
  • Resolved indirectly by having the DApp script require an encapsulating module within embark. An example of a package that could be encapsulated inside an embark module is lodash.clonedeep. Assuming the DApp script is run in the embark process or a child process, then something like this could be done: const {cloneDeep} = require(path.join(process.env.EMBARK_PATH, 'dist/lib/core/utils')).

Why am I proposing this?

If you're not 100% sure how module resolution works in Node, please read the docs.

There are presently three+ distinct node_modules trees in play between a DApp and Embark:

  1. A/embark/node_modules
  2. B/dapp([/..]*)/node_modules
  3. B/dapp/.embark/versions/*/*

(3) isn't really a node_modules tree per se, but it works like one. (2) could be a single node_modules tree or a forest of trees that will be searched from inside out.

Depending on what's installed where, A might be in a subdir relationship with B or vice versa, or they may be the same or disjoint.

Presently, when a DApp script or dependency needs to resolve a module from Embark's node_modules, we ensure that can happen by modifying NODE_PATH, and in some cases we also have to use a "custom require" approach since our runtime modification of NODE_PATH can't take effect in embark's main process (only child processes).

These solutions work but are brittle and can have surprising results. And since they're implemented at runtime, npm may report a DApp is missing dependencies even though Embark can be expected to supply them.

How does this proposal help?

With @embark/core always in the DApp's node_modules tree/forest, and with select dependencies specified and installed as peer dependencies at the top-level of the tree in which @embark/core resides, we eliminate the need for NODE_PATH and "custom require", and we can be certain that Embark and the DApp will be able to resolve the package versions they need.

This same approach should also let us confidently deprecate Embark's library manager, and therefore eliminate (3) in the list above. Any package that might be specified in a DApp's embark.json with a non-default version could be specified as a peer dependency of @embark/core with a suitable semver range , allowing a DApp developer the freedom to specify the non-default version he wants in the DApp's package.json and to depend on npm to install it in the usual way.

The reason that it's critical for us to switch some packages to being peer dependencies is that transitive dependencies (if we moved from shrinkwrap back to package-lock for @embark/core) can be unpredictable. For one, there can be semver shenanigans with dependencies' dependencies when @embark/core is installed from the registry. Also, there's no guarantee that some other dependency of a DApp (or the DApp itself) won't specify an incompatible version of a dependency also specified by @embark/core — npm might (or might not) dedupe the incompatible version into the top-level node_modules of the DApp, and then DApp scripts (such as an ejected webpack config) and DApp dependencies might end up broken at runtime.

Note that the reason we'd need to switch from shrinkwrap back to package-lock to take advantage of @embark/core's dependencies as transitive dependencies is that shrinkwrap will keep all of them in dapp/node_modules/@embark/core/node_modules such that normal module lookup from within the DApp can't succeed... and so we'd be right back to needing NODE_PATH and "custom require". 😱

@michaelsbradleyjr
Copy link
Contributor Author

michaelsbradleyjr commented Nov 8, 2018

One thing not mentioned in the write-up above is that if a DApp were to specify a dependency that itself had a peer dependency incompatible with one of @embark/core's peer dependencies, then something would have to be modified — either Embark or the DApp or the DApp's dependency. I'm not sure how often that situation would arise, and we'd want the embark cli to be as helpful as possible with identifying those situations.

@alaibe
Copy link
Contributor

alaibe commented Nov 8, 2018

Thx for making that great idea something we can all discuss formally about.
I would personally go even further via the following suggestion:

  • Fully namespace the packages:
  • @embark/cli
  • @embark/scripts (more on this name later)
  • @embark/cli only responsible to generate a app and is never use after that
    Contains only new command with template generation

  • @embark/scripts located in package.json
    Contains all the other commands
    If the scripts is not inside the local package.json and use globally => throw an error with an explanation message
    Later on, the scripts package can be splitted by module/core and this package can only be the entry point to embark world
    Example of package.json

{
  "name": "app",
  "scripts": {
    "embark": "embark-scripts",
  },
  "dependencies": {
    "@embark/scripts": "^5.0.0",
  }
}

From there you can run:
npm run embark command or yarn embark command

  • Advocate strongly for npx or npm init or yarn create when using @embark/cli to avoid global install

  • Remove the part where the cli interact with the core

I think that would reduce the number of case we have to handle and help the user to follow not only embark convention but also JS convention.
Those techniques are used by Create React App, Create Next App, Vue.js to only name the most famous one

@0x-r4bbit
Copy link
Contributor

0x-r4bbit commented Nov 8, 2018

Really really nice to see this discussion here. Let me throw in my thoughts as well:

  • I actually think that the whole package naming is a separate problem that needs to be solved and discussed, but it definitely affects the consumer facing API in this effort as well. Even though, we might want to discuss this in a separate thread, I believe we will and should end up with packages such as:

    • @embark/core
    • @embark/cli
    • @embark/plugins/<plugin> ?
    • @embark/<package> // packages that are needed but not necessarily a plugin either

    Ideally, @embark/cli which is our CLI would also just take advantage of @embark/core and any other plugins and packages it needs to do its work. So do other consumers that would for example want to use what's currently called EmbarkJS. Maybe they'll rather do sth. like import { embark } from '@embark/client' in the future.

    But again, I think those namespacing related things are really a separate problem.

  • While I do think that a create-react-app equivalent might be a nice thing to have, we shouldn't forget that Embark CLI does way more than just scaffolding a new app. Yes, this might be tackled by @alaibe 's proposal of @embark/scripts, but then again, why can't we keep it "simple" (all dependency issues aside for a moment), and just have @embark/cli a local dependency of an embark project?

    At the end of the day it should be possible to do ./node_modules/@embark/cli/bin/embark <command> IMO.

    If one doesn't feel like introducing @embark/cli globally, just to create a new app, they can just do npx @embark/cli new [--options]. I see @embark/scripts more like whatever modules and flows are needed inside embark (CLI) to make the thing work.

Presently, when a DApp script or dependency needs to resolve a module from Embark's node_modules, we ensure that can happen by modifying NODE_PATH, and in some cases we also have to use a "custom require" approach since our runtime modification of NODE_PATH can't take effect in embark's main process (only child processes).

@michaelsbradleyjr When would a Dapp ever try to resolve a module from Embark's node modules? The only thing I can think of right now, is that a Dapp wants to import from Embark (but not its node modules). For example, something I've mentioned above:

import { embark } from '@embark/client'

But also in this case, @embark/client would be a package installed as a dependency of the app itself.

This same approach should also let us confidently deprecate Embark's library manager, and therefore eliminate (3) in the list above. Any package that might be specified in a DApp's embark.json with a non-default version could be specified as a peer dependency of @embark/core with a suitable semver range , allowing a DApp developer the freedom to specify the non-default version he wants in the DApp's package.json and to depend on npm to install it in the usual way.

I'd love that. I still don't fully get why we maintain another set of versions inside the dapp.

So bottom line:

I think the most straight forward, from a developer experience point of view, is if you have to only care about one thing: you either have @embark/cli installed globally or not. Either way, the local version will be used to run your app.

That's how Angular CLI does it as well, I'm sure we can get some inspiration from their code base.

@iurimatias
Copy link
Collaborator

I don't really agree with this. It feels premature given other stuff we still need to do, overcomplicates things possibly (although I might be misunderstanding..), but more importantly in the grand scheme of things we have other priorities first.

@michaelsbradleyjr
Copy link
Contributor Author

michaelsbradleyjr commented Nov 9, 2018

Later today I can provide a more thorough reply to all the points raised so far.

To @iurimatias's point, here's the problem:

The cli shim that's landing in embark v4 is supposed to make it painless to have embark installed locally or globally, or both, and have everything always work regardless of a DApp author's preferred setup.

The shim does work well, when it comes to embark commands. It's only recently become clearer to me that there are still serious challenges regarding module resolution. Our use of NODE_PATH and "custom require" is prone to surprising behavior and needing additional work arounds as time goes on. My confidence is rather low right now that it's okay to tell developers they can install embark locally if they wish — it may not work correctly.

This proposal is a way to move forward with a setup that we know will work correctly, so long as we're careful about specifying some dependencies as peer dependencies (now, and in the future as embark evolves). It also has the benefit of laying the groundwork for breaking embark up into a set of smaller library modules. That should help make embark more flexible, allowing the components to be used in different ways, e.g. with alternative cli interfaces. Some community members have been asking about that, I think... I'm recalling discussions about angular cli and embark.

@michaelsbradleyjr
Copy link
Contributor Author

michaelsbradleyjr commented Nov 9, 2018

@alaibe good proposals! Here are some ideas that build on yours:

Instead of @embark/cli to house what's in utils/template_generator.js, call it @embark/new or @embark/create.

Rather than have an @embark/scripts package, I propose that each embark module that offers cli functionality can have the equivalent of cmd.js and cmd_controller.js. It would be the job of the @embark/cli module to take a list of modules and build the full cli at runtime. I'm thinking of modules like @embark/blockchain exposing the embark blockchain command, @embark/pipeline exposing the embark build command, and so on. We could have some meta-modules to expose commands like embark run and embark reset. @embark/core could be a meta-module and its own cmd.js / cmd_controller.js could supply run, reset, and so on.

Importantly, none of the individual modules would directly expose anything to the command-line, i.e. no "bin" entry/ies in their package.json files. Nor would @embark/cli have a "bin" entry — it would just be for cli plumbing, i.e. building the full cli from all the pieces.

More concretely...

embark package (not namespaced) could have @embark/cli and @embark/new as dependencies. When its bin/embark is run outside a dapp it would pass '@embark/new' (which exposes new and demo commands) to @embark/cli.

When the embark package's bin/embark is run inside a dapp something special happens: bin/embark having detected it's running inside a dapp (it found embark.json) it checks that the package.json it found for the dapp specifies @embark/core as a dependency and that it's installed. Assuming it's ready to go, its passes a list of modules indicated by @embark/core ([ '@embark/blockchain', '@embark/storage', '@embark/core', ... ]) to @embark/cli. @embark/core would have @embark/blockchain, etc. as dependencies, so they're ready to go as long as @embark/core is installed.

The embark package could be installed globally or locally, it would work either way for running its bin/embark; we can still support the shimming behavior, which is a distinct behavior from hooking into @embark/core.


I'm sure that all sounds complex, but here are the key benefits:

  • Only the embark package exposes a "bin" entry.
  • bin/embark of the embark package could be used globally (embark run) or locally (npx embark run, npm run embark run), or global can shim to local, meeting the same goals we had before.
  • If @embark packages are mistakenly installed globally nothing bad happens, but they're not usable in that way since they don't provide any "bin".
  • Globally installed embark would have @embark/cli and @embak/new as dependencies, but that's a special case of global @embark and opaque to the user.
  • If someone wants to use @embark modules from a completely different cli or programmatic interface, they can! They can write that interface to hook to any number of @embark modules' cmd.js / cmd_controller.js scripts — they could do that directly, or make use of @embark/cli, depending on what best suits their use case.

@michaelsbradleyjr
Copy link
Contributor Author

michaelsbradleyjr commented Nov 9, 2018

@PascalPrecht agreed on nearly all points re: namespacing. See my previous comment, will definitely appreciate some feedback.

A big gain of the approach I outlined is that it would be possible to do npx embark new [--options] and it would be super fast and efficient. You can do that right now, but the temporary-install step at the beginning of npx embark new makes it impractical because all of embark has to be installed. If the bulk is carved off into @embark packages, then the situation will be much improved.

I think keeping embark (not namespaced) as the sole "door" to the cli facilities implemented in the @embark packages would have a lot of benefits. From an Embark user's perspective:

[npx | npm run] embark [cmd] [--options]

^ that will simply offer what he wants inside or outside a DApp, regardless of whether embark has been installed globally or locally. The plumbing to @embark modules and peer dependencies inside his DApp would be an implementation detail that's behind the scenes.

@michaelsbradleyjr
Copy link
Contributor Author

michaelsbradleyjr commented Nov 9, 2018

@PascalPrecht to answer your question about resolving modules:

In the Embark 3.1.x and 3.2.x series and in the 4 alphas (not sure about <=3.0.x), DApps might include a .babelrc enabling babel plugins/presets listed in the DApp's dependencies, which sometimes (not always) rely on Embark to provide those packages with peer dependencies such as @babel/core. In the 3.1.x series, there were some hacks (mysterious double-runs of webpack, process.chdir() games) that allowed webpackProcess to find modules inside both the DApp and Embark. babel-loader is an example of a module inside Embark, while babel v6 plugins might have been supplied by the DApp.

Starting with 3.2.x, things got more complicated with the ejectable webpack config, which expects Embark to provide it with the lodash.clonedeep and glob packages, as well as a bunch more (compared to 3.1.x) babel related things. When the webpack config is inside embark's file tree, that's mostly[+] not a big deal because node's lookup procedure starts at __dirname of the script. BUT when the webpack config is inside the DApp... what to do?? That's where NODE_PATH and later "custom require" (old, new) come into play.

[+] "mostly", because even a non-ejected config can have problems if a DApp is using a babel plugin that needs @babel/core and the DApp is expecting Embark to provide it — in that case NODE_PATH is still needed whether Embark is install globally or locally, in the latter case because of shrinkwrap.

While the NODE_PATH and "custom require" workarounds are primarily serving the needs of the webpack config, the same/similar workarounds would be necessary if there's ever another case of an "ejectable" script, i.e. one that might run from within embark's file tree, or maybe from within the DApp's file tree. Also, DApp authors in the wild might start relying on things in Embark's node_modules as pseudo-transitive dependencies precisely because we're using NODE_PATH and therefore making them available to the DApp. If we upgrade any of those dependencies, a DApp might break if we upgraded to a version that's incompatible; that is, we may have dealt with the upgrade on our end (in Embark's internals), but for the DApp author it would be a surprise. If we're not careful, we might even introduce that kind of pseudo-transitive dependency relationship in our own templates and test apps without being fully aware of it.

Another case when a DApp can try to indirectly resolve a dependency from Embark's node_modules is related to the "versions" section of a DApp's embark.json. That situation is a little different, and it can flip around such that Embark directly resolves a dependency from a DApp's .embark/versions directory.

If we step back and consider the bigger picture, all of these connections can be better expressed as peer dependency relationships between a DApp and Embark. For example, glob and lodash.clonedeep could be encapsulated in an @embark/support package that's listed in the "peerDependencies" of @embark/core's package.json. Likewise, @babel/core and other babel-related things could be listed in "peerDependencies".

That will work great, but we'll need both @embark/core and all the peer dependencies to be installed in the DApp's node_modules in order for the DApp and Embark to be able find them without resorting to NODE_PATH and/or "custom require".

@0x-r4bbit 0x-r4bbit added this to Current iteration in Embark Nov 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Embark
  
Current iteration
Development

No branches or pull requests

4 participants