Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shared modules/progressive apps support #2681

Open
scf37 opened this issue Dec 9, 2016 · 34 comments
Open

Shared modules/progressive apps support #2681

scf37 opened this issue Dec 9, 2016 · 34 comments
Assignees
Labels
Milestone

Comments

@scf37
Copy link

@scf37 scf37 commented Dec 9, 2016

Currently ScalaJS is not that suitable for mobile development because of monolithic xxx-opt.js.
While loading hundreds Ks of javascript on typing site URL is perfectly appropriate for many situations (B2B, desktop), mobile users won't be such as happy given their slow connections.

Google Closure Compiler allows splitting codebase into multiple modules via --module and --js arguments:
http://stackoverflow.com/questions/10395810/how-do-i-split-my-javascript-into-modules-using-googles-closure-compiler/10401030#10401030

It could be really great if ScalaJS allow to split generated javascript to multiple modules to avoid waiting for too long for first page to show. For example:

  • common (scala.* and libraries)
  • landing page
  • main page
  • settings
  • etc etc etc.

Splitting could be done by packages, by subprojects, by deps etc.

@scf37 scf37 changed the title Shared modules support Shared modules/progressive apps support Dec 9, 2016
@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Dec 12, 2016

I have the feeling that we already have this idea filed somewhere, but I can't find it. Does someone remember where we discussed this before?

@glmars

This comment has been minimized.

Copy link

@glmars glmars commented Dec 12, 2016

This one: #1787 ?

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Dec 13, 2016

@scf37 Some part of that discussion was relevant, indeed, but not all of it. The present issue is a neat summary of one particular core feature, so I'll keep it. The other features necessary for #1787 turned out to be already implemented. So I closed #1787 in favor of the present issue + these other implemented features.

@ramnivas

This comment has been minimized.

Copy link

@ramnivas ramnivas commented Mar 11, 2017

One more concern (besides the desired quick loading of the first page) is that using Scala.js commonjs support with tools such as Webpack (which we use currently) is basically unusable during development. When implemented with typical settings (enable source maps, for example), each time (even with a tiny change), our development builds take 40-45 seconds (3-5 second for the scala.js part and the rest for the Webpack part). This is mostly due to Webpack parsing a huge -fastopt.js for every change. We got away, for now, using the noParse option set for -fastopt.js, but that makes us lose the sourcemaps support. We could try scalajs-bundler, but I highly doubt if it is going to do any better, since at the end of the day, it too runs Webpack.

These tools are optimized for the situations where only few small source files (js, css, etc) change at a time during development. Scala.js in its current form doesn't meet this assumption (it spits out a large "source file").

This feature could promote import/export of many small modules within an app. Then each change should touch only the relevant -fastopt.js file and lower the parsing load on Webpack and other bundler tools.

@cuzfrog

This comment has been minimized.

Copy link

@cuzfrog cuzfrog commented Apr 14, 2017

Agreed. @ramnivas scalajs-bundler seems like a workaround, not a solution.

Scala.js' is fast despite single large bundle file, nevertheless, it throws the problem to npm/webpack, which have no idea how to deal with the giant, and they won't.

Modules by @JsImport should not be bundled, at first, this is work for webpack alike.

Is it possible to split the *opt.js after it's been emitted, but before webpack intervenes? If yes, how?

Thanks to tmpfs, by putting node_modules and target dir into RAM, I manage to speed up whole process about 20% compared to on SSD. Turns out webpack parsing is a CPU intensive work, and cannot be greatly optimized by this.

@aappddeevv

This comment has been minimized.

Copy link

@aappddeevv aappddeevv commented Apr 21, 2017

I just hit this issue. I need to develop a library on top of many other node.js npm modules and other js libraries and have my library usable by potentially other js libraries (which we can do today) as well as other scalajs libraries which are 1) deployed through both npm and "sbt/maven" and 2) bundled into clever bundles for web deployment. I think the current output model would replicate the scalajs standard libraries multiple times inside the module. If I could control scalajs output a bit more, without wrapping in the std lib and other dependencies, then I could use webpack, npm, scalajs-bundler and scalajs-loader to manage all the other assembly steps, it would not need to be a pure scalajs/sbt process. This could allow the ability to deploy scalajs based libraries through npm/maven and be consumed by any js/scalajs + application/library + node.js/browser + webpack/sbt combination.

@ubolonton

This comment has been minimized.

Copy link

@ubolonton ubolonton commented Jul 3, 2017

This would probably make it much easier to create a hot code patching tool similar to https://github.com/bhauman/lein-figwheel.

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Jul 3, 2017

@ubolonton No it wouldn't. You still need to link the entire app at once, and a change in one part of the code can affect the other "modules". This issue is only about the possibility to decompose the one .js output file into several .js files that can be independently loaded (not updated).

@scf37

This comment has been minimized.

Copy link
Author

@scf37 scf37 commented Sep 24, 2017

This feature should also decrease time of incremental compilation for large projects because only some bundles have to be reassembled.

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Sep 24, 2017

@scf37 No it probably wouldn't. See my comment just above yours.

@scf37

This comment has been minimized.

Copy link
Author

@scf37 scf37 commented Sep 24, 2017

@sjrd even for fastOptJS? I thought all names in fast opt mode are deterministic so why not?

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Sep 24, 2017

Yes, even for fastOptJS. The same comment applies. There are core strategic design decisions in how Scala.js compiles to JavaScript that make this impossible. Deterministic names is not the end of the story. You can look at what the Emitter does if want more precise answers.

Let's not pollute this thread further with random speculation.

@ramnivas

This comment has been minimized.

Copy link

@ramnivas ramnivas commented Jan 14, 2018

Related FYI: We recently finished moving to scalajs-bundler. Using it in the LibraryOnly mode, we are able to use @JSImport to reference external JS modules as well as project's own assets without experiencing the ~45 second per build (now it is more like 10 second--high, but not as bad). So yay!

The bundle size is still pretty high (the fullOptjs file ~550 KB gzipped + other libraries and assets) with each class of users (admin, general public, student, teacher, content creator) needing only a small portion of functionality. So the rest of the issues raised here is still relevant to us.

@gbogard

This comment has been minimized.

Copy link

@gbogard gbogard commented Nov 19, 2018

I know this issue has been standing still for quite a few months now, but I'm also very interested in this. Not all of my users need all the features of the app, so being able to split the bundles by page would be a huge win

@gzm0 gzm0 added this to the Post-v1.0.0 milestone May 22, 2019
@gzm0 gzm0 self-assigned this Nov 29, 2019
@gzm0

This comment has been minimized.

Copy link
Contributor

@gzm0 gzm0 commented Nov 29, 2019

I am starting to look into this, now that 1.0 is mostly done.

@gzm0

This comment has been minimized.

Copy link
Contributor

@gzm0 gzm0 commented Nov 29, 2019

From what I understand we need two large building blocks. Definitions of splitting taken from [1]. Comments on feasibility, unmet requirements, misconceptions are very welcome.

Step 1: Split output

Split the SJS output in multiple modules that import each other. This would enable "Entry point splitting": For example, split a user facing app and an admin interface.

Random thoughts:

  • Splitting by class seems most intuitive
  • Can bundlers deal with this number of chunks?
  • What to do with circular dependencies
  • If we need to be smarter, incrementality might be an issue.
  • Splitting by artifact (Java style module?) might be another alternative.

Step 2: Dynamic loads

Load modules dynamically import() instead of import so that even single page apps can load partially.

Random thoughts:

  • Will likely need language support because asynchronicity "leaks".
  • Probably best driven by the developer anyways (and kept explicit).
  • Making sure an intended separation stays separated might be a challenge.
  • If splitting is by artifact / module, introduce "dynamic dependencies"?
@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Nov 29, 2019

Step 1

  • At least Windows itself has serious issues with large number of files. Splitting every class is going to be completely prohibitive on that platform, regardless of the bundler.
  • I think we can probably provide a more-or-less custom config to specify where to put the boundaries. The minimum would be to allow to specify a set of packages that should go together.

Dynamic loads

  • The minimum would be just provide it as is, relying on the developer to use js.import(...) where appropriate.
  • We could provide some better language support in the form of a js.dynamicConstructorOf[C] and js.dynamicLoad(Obj) where C and O are @JSImport classes and objects. Those could return a better typed Promise than the bare js.import.
  • We could go further with a js.dynamicLoad { <expr> } that would isolate in <expr> all the references to @JSImport classes and objects, and request them all via an underlying js.import(...) before proceeding with the evaluation of the expr. This should probably be done in a macro outside of the core.
@ramnivas

This comment has been minimized.

Copy link

@ramnivas ramnivas commented Nov 29, 2019

Very glad to this discussion getting restarted!

Split output

  • This, by itself, will be a huge productivity boost. In our application, for example, we can't use hot-loading with React due to how it treats Context. Generally, context providers sit very high-up in the component tree and subtrees using consumers are updated whenever the provider changes the context. React uses reference equality in detecting if context is changed. With a single -fastOpt.js, with every change, all providers are recreated, and thus creating a new context, thus invalidating all consumers (which, almost always, means effectively every component).

  • Custom config to define the unit of the split seems fine. Ideally, there are predefined options such as per-class, per-package, etc. Is per-source file a viable option? If so, that may be a fine default, since it mirrors the JS world (and Windows developer choose another option if the number of files becomes a problem).

Dynamic loads

@sjrd By your references to JSImport classes/object, do you mean, dynamic loading will be limited to external javascript-defined code?

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Nov 29, 2019

By your references to JSImport classes/object, do you mean, dynamic loading will be limited to external javascript-defined code?

Ah! That's a good point. So we'll also need to support that for Scala classes and objects. Somehow ...

@gzm0

This comment has been minimized.

Copy link
Contributor

@gzm0 gzm0 commented Nov 30, 2019

Ok, seems like manual config for splitting is a good starting point. Thoughts:

  • Package prefix is probably the best starting point, no point in specifying scala, scala,collection, etc.
  • We can simply disallow cyclic dependencies so we only need to detect them, not avoid them.

RE dynamic loads: @ramnivas you beat me to it. I think in practice this will be not as invasive as it seems, we might get away with defining an Async IR node and some compiler primitives. What I'm unsure about is how to make sure the splitting happens as expected by the developer even under complex conditions.

@japgolly

This comment has been minimized.

Copy link
Contributor

@japgolly japgolly commented Nov 30, 2019

Package prefix is probably the best starting point, no point in specifying scala, scala,collection, etc.

How about a regex over FQCNs?

@godenji

This comment has been minimized.

Copy link

@godenji godenji commented Dec 11, 2019

Great news! This has been the single biggest blocker wrt going all-in on Scala/Scala.js.

Massive 👍

@gzm0

This comment has been minimized.

Copy link
Contributor

@gzm0 gzm0 commented Dec 24, 2019

Thinking further about selectors. Requirements I can think of:

  1. Concept of root module (Scala.js needs this for its runtime)
  2. Fully qualified class name prefix based classification.
  3. Check non-intersection of specifiers without a concrete list of classes (ensures a wrong config is caught early on).
  4. Option to fail if unknown class is present (instead of putting it into a default module)
  5. Option to put unknown classes in a default module.

Notes:

Requirement 3 rules out regexes over FQCNs (/cc @japgolly). Personally I find verifiability more important than flexibility but I'm willing to discuss this.

Requirement 4/5: Unclear what the default should be, especially in view of standard libraries and classes required by the SJS runtime itself (which must be in the root module).

If the default is to have the root module as default module, these things neatly go automatically. However, it exposes the user to the risk of additional classes creeping into the default module (that might not belong there).

Another option would be to have a "smart default" where some "well known" packages are classified into the root module automatically. The downside is that specifying how this default evolves and how one can interact with it (how to tweak parts of it) is more difficult.

Please LMK what you think.

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Dec 24, 2019

I don't think we need the concept of "root module" per se. Instead I suggest that we systematically put the core JS lib with whatever module contains java.lang.Object (rather than forcing java.lang.Object to be in the "root module"). This will make it simpler from a user's perspective, I believe.

@japgolly

This comment has been minimized.

Copy link
Contributor

@japgolly japgolly commented Dec 24, 2019

@gzm0 that's a smart consideration, don't worry about my regex suggestion. The scenarios that trade-off would affect are cases where you want to split packages into multiple modules and don't have control over the package structure, like 3rd party libs. In which case, prefixes over regex would be fine so long as:

  1. The prefix is of fqcn, not just package name. In other words, one can specify exact classes.
  2. There's some kind of rule prioritisation mechanism that allows overlap (like acls) so that you could say a.b.C (class) goes to module X, and the remaining a.b. descendants go to module Y. Actually that would probably be pretty useful just for packages too: eg. "java.time goes here, rest of java goes there". You wouldn't want to specify each and every java.xxx where xxx is everything except time. The typical way of doing this is just ordered list of include/exclude rules that allow overlap. Hmmmm maybe req 3 isn't desirable after all...
@gzm0

This comment has been minimized.

Copy link
Contributor

@gzm0 gzm0 commented Dec 27, 2019

Thanks for the suggestions.

@sjrd using java.lang.Object as proxy for where the core JS lib goes is excellent. Definitely easier to understand and equally flexible.

@japgolly thanks for the comment. This made me think of another use case: classify certain classes automatically into the "minimal required" module. Probably not something that we'd support in the first version, but it's good to keep it in mind so I design the API to be extensible for that (my current draft isn't).

The prioritization you describe is quite easily done: Say we have two Map[String, String]s: classToModule and packageToModule. For every class, if it's exact name is in classToModule, we take that. Otherwise, we keep looking up it's packages from the deepest one in packageToModule and take the first we find.

This is:

  • Always unambiguous (i.e. we never get two modules for a given class)
  • Allows the partial selections you describe (e.g. Map("java" -> "main", "java.time" -> "time")).

So ATM I'm tending towards something like this:

val mappings =
  ModuleMappings.package("java", "main") ++
  ModuleMappings.package("java.time", "time") ++
  ModuleMappings.class("java.time.Foo", "main") ++
  ModuleMappings.default("main")

Note that package and class need to be different because a.b.c can be a package (prefix) and a full class name at the same time (at least in our IR).

@gzm0

This comment has been minimized.

Copy link
Contributor

@gzm0 gzm0 commented Dec 27, 2019

@sjrd Other issue: How do I communicate SJS private symbols between modules? If module A needs a thing in module B, how does it access it without polluting module B's "public namespace"? Best solutions I could come up with:

  • Put everything on single object with a mangled name.
  • Create "export modules" which contain exported symbols and are always leaves.

I like option 2 more because it is cleaner. Downside is that the config surface becomes more complicated.

@sjrd

This comment has been minimized.

Copy link
Member

@sjrd sjrd commented Dec 27, 2019

I had option 2 in mind as well. I think it's the only bulletproof thing to do.

@gzm0

This comment has been minimized.

Copy link
Contributor

@gzm0 gzm0 commented Dec 27, 2019

Yeah... Downside is that with this, having a single file is not a mere special case of the multi file output :-/

@gzm0

This comment has been minimized.

Copy link
Contributor

@gzm0 gzm0 commented Feb 2, 2020

I have been thinking about how to build the interface to this for a while and I'm not sure the proposed mappings are really what we want.

Take the following config:

val mappings =
  ModuleMappings.package("myapp.core", "core") ++
  ModuleMappings.package("myapp.frontend", "frontend") ++
  ModuleMappings.package("myapp.backend", "backend") ++
  ModuleMappings.default("core")

There is (or might be) an implicit assumption in this configuration that frontend and backend depend on core but not on each other. Since this is implicit, it cannot be verified.

Therefore, if by happenstance, something in backend refers to frontend, the resulting module structure will be valid, but broken.

Hence, I'm toying with the idea of requiring this to be explicit. Something like:

val modules = Seq(
  Module("core")
    .withMappings(Mappings.default()),
  Module("frontend")
    .withMappings(Mappings.package("myapp.frontend"))
    .withDeps("core"),
  Module("backend")
    .withMappings(Mappings.package("myapp.backend"))
    .withDeps("core"),
)

This would both avoid this problem and make some of the implementation easier.

A potential downside is verbosity in configuration. A further caveat is how to deal with transitive dependencies (if we allow them, we can make a similar argument about undetected dependency creep, although it would only manifest later).

Opinions?

@godenji

This comment has been minimized.

Copy link

@godenji godenji commented Feb 7, 2020

I like it, looks similar to SBT's aggregate and dependsOn.

To avoid stringly typos would we be able to do something like:

val core = Module("core").withMappings(Mappings.default())
val modules = Seq(
  core,
  Module("frontend")
    .withMappings(Mappings.package("myapp.frontend"))
    .withDeps(core),
  Module("backend")
    .withMappings(Mappings.package("myapp.backend"))
    .withDeps(core),
)

?

@gzm0

This comment has been minimized.

Copy link
Contributor

@gzm0 gzm0 commented Feb 18, 2020

#3965 contains an RFC for an interface specification.

@ramnivas

This comment has been minimized.

Copy link

@ramnivas ramnivas commented Feb 20, 2020

I am not so sure that this package-based modules idea is the best way to proceed. A few reasons:

  • Package may not be the right level of separation. For example, we have a package paya.ui that has a bunch of generic UI components (Button, Slider, Dialog, etc.) defined. Not all parts of the code need not all the components.
  • It will make it hard to deal with some of the expected functionality in the JS world. For example, as I mentioned in #2681 (comment), hot-loading in the presence of React Context with less granular JS files will not work well.
  • It is significantly different than the JS way, requiring a different way of thinking, resulting in a higher barrier for newcomers considering ScalaJS. Working and keeping up-to-date with Webpack will require special ongoing effort, which our relatively small community may not be able to keep up with. And then there are new tools that exists or are being proposed in the JS world (for example, rollup, parcel, ESBuild, SWC). We cannot possibly support each of these tools.

So what could be done? Please note that I am sure that I have gotten a few details off. But hopefully, this will serve as a basis for discussion. Also, I may be repeating a few things said by me/others in the past, but trying to put a more complete picture here.

The core idea is to mimic the JS world as closely as possible. Then Webpack et al will have their assumption met (specifically that only a few files change at a time during development). I also don’t think we will need scalajs-bundler if this is implemented. Letting Webpack parse only a few JS files will be very performant and will not need the LibraryMode workaround and then there is very little is left for scalajs-bundler to do.

Consider the following set of classes:

class A {
  def display(): String = {
    val b = new B()
    b.hello()
  }
}

class B {
  def hello(): String = {
    "hello"
  }
}

Should lead to the creation of the following files:

// A.js
import {B} from './B';

export class A {
  display() {
    let b = new B()
    return b.hello();
  }
}

// B.js
export class B {
  hello() {
    return "hello"
  }
}

Basically, every type gets its own dedicated js file and includes an import for each referred type. One obvious objection is the sheer number of files produced. But a quick check in our real project reveals that we have 35770 js files (most of those coming from node_modules) and that doesn’t seem to be a problem (true, I am on a Mac; but Windows users, too, presumably have similar number of js files in their projects). Also cc @shadaj who has an idea of dealing with this issue, should it become a problem.

This, by itself, solves a bunch of problems. For example, unless the B scala type is modified, B.js will remain the same (at least in the fastOptJS mode, I hope). This means the React Context problem (which relies on reference equality) will no longer be an issue. Webpack will have minimal work to do during iterative development. And scalajs-bundler will be no longer required (short of providing integrated experience from within SBT; but that is, IMO, not super critical).

Now onto dynamic imports. Consider the following modified version of A.

class AA {
  def display(): Future[String] = {
    dynamicImport {
      val b = new B()
      b.hello()
    }
  }
}

Assuming that we have defined dynamicImport as the following (in Scala.js alongside JSImport and such):

def dynamicImport[T](thunk: => T): Future[T] // a marker method 

Will lead to the creation of the following A.js (B.js unchanged):

export class AA {
  display() {
    return import('./B').then(({B}) => {
      let b = new B()
      return b.hello();
    });
  }
}

Basically, each type referred inside a dynamicImport block will result in a corresponding import statement. With this generated code, Webpack will create a separate chunk for the code inside the import block.

Let’s also consider a case where static and dynamic imports share a type.

class AAA {
  def display(): Future[String] = {
    dynamicImport {
      val b = new B()
      b.hello()
    }
  }

  def display2(): String = {
    val b = new B()
    b.hello()
  }
}

Here, we could do nothing special and import B statically and then again dynamically and leave it up to Webpack to understand (which correctly handles the situation by not creating a separate chunk to load B). Or could do something special to avoid dynamically importing types that are already required in the static scope.

@gzm0

This comment has been minimized.

Copy link
Contributor

@gzm0 gzm0 commented Feb 21, 2020

@ramnivas thank you for raising your concerns!

I have a couple of clarification questions:

Package may not be the right level of separation. [...] Not all parts of the code need not all the components.

What I read between the lines is that you do not want to define a module per class (i.e. UI component) yourself here? (which with the current proposal is possible, but maybe cumbersome indeed).

It will make it hard to deal with some of the expected functionality in the JS world. [...] hot-loading in the presence of React Context with less granular JS files will not work well.

Could you clarify how the context maps to files and reloads? I do not know how react contexts are typically set-up and how the hot-loading works. From what I have understood so far, wouldn't it be enough to simply put the context provider in a separate module?

It is significantly different than the JS way, requiring a different way of thinking, resulting in a higher barrier for newcomers considering ScalaJS.

Could you clarify what the JS way is and how what we are doing is more different than necessary?

RE your proposal of per-class splits

How does this deal with cyclical dependencies? E.g.

class A { def foo = new B }
class B { def foo = new A }

I have a feeling that having cyclical imports is a recipe for disaster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.