Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memoization in Broccoli #396

Merged
merged 14 commits into from
May 16, 2019
Merged

Memoization in Broccoli #396

merged 14 commits into from
May 16, 2019

Conversation

thoov
Copy link
Member

@thoov thoov commented Apr 12, 2019

Background

Currently when a file change is detected broccoli will simply rebuild all of its nodes. Calling build can be expensive and results in nodes have to implement their own caching strategies (in best cases) or not caching (in worst cases). This PR "short circuits" the build chaining by enabling memoization (opt in via BROCCOLI_ENABLED_MEMOIZE=true) of nodes. This enables large amounts of the tree to no longer be rebuilt if broccoli detects that a nodes inputs have not changed (thus reducing rebuild times).

How does this work

  1. NodeWrappers (both source and transform) have a property called _revision that increments whenever their inputNodes have "changed"
  2. This property is changed one of 2 ways. For source nodes the watcher_adapter calls revise on them whenever sane detects a change. For transform nodes, revise is automatically called whenever its build method is called.
  3. During the build chain, we do not call the build method on a node if its inputNodes properties are the same as the last build attempt (a transform-node can "force" being built by passing volatile=true as an option).

Related PRs

- Use callbackObject to get the plugins _revision count
- Connect the watchRoot with the sane changed callback
@thoov
Copy link
Member Author

thoov commented Apr 16, 2019

@oligriffiths / @stefanpenner If you have some time could you both review this PR. I would like to discuss it at our Wednesday sync. The current status is still very much "work in progress". I just want to get your guy's input into the direction it is going and to talk about some of the "design" decisions I am making.

lib/wrappers/transform-node.js Outdated Show resolved Hide resolved
lib/wrappers/transform-node.js Outdated Show resolved Hide resolved
lib/watcher.js Outdated Show resolved Hide resolved
lib/watcher.js Outdated Show resolved Hide resolved
lib/wrappers/transform-node.js Outdated Show resolved Hide resolved
- Renaming variables
- Cleanup _revision work
@oligriffiths
Copy link
Contributor

@thoov
Copy link
Member Author

thoov commented Apr 18, 2019

@oligriffiths I sync'd with @stefanpenner after our talk and he brought up 2 good points that I don't believe we gave enough weight too:

  1. Doing the FSTree diffing after every build will be very expensive. Thus it wont be a great "step 1" approach as it will be a negative step backwards from a perf point of view.
  2. Even with the eventual this.output facade there will still need to be a way for plugins to signal that things are "revised". Example would be if the plugin does anything outside of the facade (like create a random file somewhere outside its outputPath) then it would still have to call this.revised(); based on its own logic.

I propose the following phases:

Phase 1:

  • We add a revision counter on the base wrapper class. This will allow change identification for both source-wrappers (watchDirs) and transform-wrappers (plugins).

  • We add a revised method to plugins. Calling revised() calls a closure that was setup by broccoli that will increment a "revision" counter on that transform-wrapper. This will be future compatible with this.output facade as that facade will under the hood call revised().

  • During the build, wrappers will conditionally determine if the underlining build method should be invoked based on checking their inputWrapper revision counts. THIS POINT NEEDS TO BE AGREED UPON (@stefanpenner / @oligriffiths)

@oligriffiths has expressed a vision towards a shouldBuild method that wrappers would call into which return if a build should happen or not. This method would be able to access a this.inputs facade that would be able to call this.inputs.hasChanges(). Example:

Base class:

class BroccoliPlugin {
  shouldBuild() {
    return this.inputs.hasChanges();
  }
}

Custom class:

class MyPlugin extends BroccoliPlugin {
  shouldBuild() {
    if (Math.random() > .5) {
      return false;
    }
    
    return true;
  }
}

Phase 2:

  • We add a this.inputs and this.output facade onto plugins that wrap all "fs functionally". These facades with be able to "track" changes and under the hood invoke this.revised().

@oligriffiths
Copy link
Contributor

@thoov Yeah that makes sense.

I think we could combine the two approaches and have the concept "revised" and output path be linked with a directory object. So rather than doing the file system diffing, expose revised() on the Directory object that wraps the outputPath and will eventually be the fs facade, meaning you won't need to manually call revised() on it.

So

build() {
  // do my plugin build
  this.output.revised();
}

or something.

If we can decide on what the interface of the Directory facade should look like that solves the issue now, if we think that having an object to mediate file system stuff AND perform revision tracking, then we could just implement the revision tracking aspect right now and leave the file system stuff for later.

Another thought: if a plugin opts into sideEffectFree, could we register a watcher for the output path and call revised() automatically? Or would that add too much overhead and slow things down? I know we moved to the tmp dir to get rid of inotify events to speed things up.

@oligriffiths
Copy link
Contributor

Had some spare cycles so experimented with the above #397

@thoov thoov changed the title A spike on "pure / side effect free" broccoli Side effect free Broccoli May 1, 2019
@thoov thoov changed the title Side effect free Broccoli sideEffectFree Broccoli May 1, 2019
@thoov thoov changed the title sideEffectFree Broccoli Memoization in Broccoli May 2, 2019
Copy link
Contributor

@stefanpenner stefanpenner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left 1 question

test/builder_test.js Outdated Show resolved Hide resolved
Copy link
Contributor

@oligriffiths oligriffiths left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this is looking great.
Couple of points.

We need to support 2 separate configuration options:

  1. Manual revision tracking - in stead of broccoli auto calling .revise() after each build, the plugin can opt-in to calling it manually
  2. Volatile plugins - regardless of any input node changes, always call build

It seems you are achieving this via the memoize config option that can contain different values. I'm not sure this is the best API and I wonder if they should be separated out into separate options.

Second point, the name memorize I'm not sure really says what is happening here. What you're doing is opting into manual change tracking.
I also think that unless you mark your plugin as "volatile" that we should call shouldBuild() before every build.

LMK your thoughts

lib/watcher_adapter.js Outdated Show resolved Hide resolved
lib/wrappers/node.js Outdated Show resolved Hide resolved
lib/wrappers/transform-node.js Outdated Show resolved Hide resolved
@stefanpenner
Copy link
Contributor

stefanpenner commented May 8, 2019

Just had a chat with @oligriffiths, the summary of our chat was:

  • Let's put this behind and environment flag, allowing users to opt-in via process.env.BROCCOLI_VOLATILE = true. This allows us to cut a release ASAP, and begin testing + gaining feedback from IRL scenarios. We can leave this flag around forever, but once ready we can flip the default.
  • We think, we can get away with dropping memoize: 'custom' for now – the biggest gains should not require it upfront, but let's let the real world inform if we, and how we need this functionality
  • to simplify adoption, we should make the circumvention of memoization be the config option. Specifically, { volatile: true } would circumvent, and no config option would default to the memoization goals this PR introduces.
  • As you may have noticed, we also have a proposal for the config name: volatile, although somewhat abrasive, it should be almost never actually need to be used, but when it is it is very descriptive.

Given the above mentioned changes, we should 🚢. If you have concerns, or if we have missed something important, let's chat. But otherwise, we feel really good about this.

lib/wrappers/transform-node.js Outdated Show resolved Hide resolved
lib/watcher_adapter.js Outdated Show resolved Hide resolved
lib/watcher_adapter.js Outdated Show resolved Hide resolved
@thoov thoov removed the request for review from rwjblue May 14, 2019 04:28
lib/wrappers/transform-node.js Outdated Show resolved Hide resolved
lib/wrappers/transform-node.js Outdated Show resolved Hide resolved
lib/watcher_adapter.js Outdated Show resolved Hide resolved
lib/builder.js Outdated Show resolved Hide resolved
lib/wrappers/transform-node.js Outdated Show resolved Hide resolved
lib/wrappers/transform-node.js Outdated Show resolved Hide resolved
lib/wrappers/transform-node.js Outdated Show resolved Hide resolved
@stefanpenner
Copy link
Contributor

:shipit:

@thoov thoov merged commit d4e4e0c into broccolijs:master May 16, 2019
@thoov thoov deleted the pure-broccoli branch May 16, 2019 20:18
@oligriffiths
Copy link
Contributor

👍

@lifeart
Copy link
Contributor

lifeart commented May 18, 2019

image
image

But, results not appear

@oligriffiths
Copy link
Contributor

@lifeart the print out shows the slowest nodes so skipped nodes will almost never show as they have zero time.

Which file did you make a change in to cause a rebuild and what else is in that tree?

@thoov
Copy link
Member Author

thoov commented May 19, 2019

@lifeart / @oligriffiths Can we move this conversation to a new issue? Could we also clearly state what the problem is and how to reproduce in that issue?

@lifeart
Copy link
Contributor

lifeart commented May 19, 2019

@thoov sure, I will try to create reproduction repo with latest ember and pined broccoli master.
@oligriffiths only component template.

@lifeart lifeart mentioned this pull request May 20, 2019
@lifeart
Copy link
Contributor

lifeart commented May 20, 2019

@thoov @oligriffiths #404

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants