New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate use of context + observedBits for performance optimization #1018

Open
markerikson opened this Issue Sep 12, 2018 · 19 comments

Comments

4 participants
@markerikson
Copy link
Contributor

markerikson commented Sep 12, 2018

I've written about this in several places recently, so I might as well consolidate the information as an issue. I'll copy and paste from the other stuff I've written as appropriate.

Prior discussions and materials for reference:

Summary and Use Case

In React-Redux v5 (the current version), every instance of a connected component is a separate subscriber to the store. If you have a connected list with 10K connected list items, that's 10,001 separate subscriber callbacks.

Every time an action is dispatched, every subscriber callback is run. Each connected component checks to see if the root state has changed. If it has, that component re-runs the mapState function it was given, and checks to see if the values it returned have changed. If any of them have, then it re-renders your real component.

In v6, we're changing it so that there's only one subscriber to the Redux store: the <Provider>. When an action is dispatched, it runs this.setState({storeState : store.getState()}), and then the store state is passed down to connected components using the new React.createContext API. When React sees the value given to the Context.Provider has changed, it will force all of the associated Context.Consumer instances to update, and that's when the connected components will now re-run their mapState functions.

So, in v5, there's 10,001 subscriber callbacks and 10,001 mapState functions executed. In v6, there's only 1 subscriber callback executed, but still 10,001 mapState functions, plus the overhead of React re-rendering the component tree to update the context consumers. Based on what we've seen so far, the overhead of React updating is a bit more expensive than it was to run all those subscriber callbacks, but it's close. Also, there's other benefits to letting React handle this part of the work (including hopefully better compatibility with the upcoming async timeslicing stuff).

However... as many people have pointed out, in most apps, for any given action and state update, most of the components don't actually care about the changes. Let me give a different example. Let's say that our root state looks like {a, b, c, d}. Doesn't matter what's inside those, but for sake of the argument let's say that each top-level slice holds the data for 2500 items, and a separate connected component for each item.

Now, imagine we dispatch an action that updates, say, state.a[1234].value. state, state.a, and state.a[1234] will be updated to new references by our reducers, but state.b, state.c, and state.d are all the same.

That means that only 2500 out of 10K components would have any interest in the changes at all - the ones whose data is in state.a. The other 7500 could really just skip re-running their mapState functions completely, because the top-level slices their data is in haven't changed.

So, what I imagine is a way for a connected component to say "hey, I only want to re-run my mapState function if state.b or state.c have changed". (Technically, you could do something sorta like this now with some of the lesser-known options to connect, I think, but lemme keep explaining.) If we did some magic to turn those state slice names into a bitmask pattern, and <Provider> ran a check to calculate a bitmask pattern based on which state slices did change, then we could potentially use that to skip the update process entirely for any components that didn't care about these changes, and things would be a lot faster as a result.

Where the proxy stuff comes in would be some real "magic" , where we could maybe "automagically" see which state fields your mapState function tries to read. It's theoretically possible, but very very complex with lots of edge cases. If that doesn't work, we'd maybe let you pass an option to connect that says subscribeToStateSlices : ["a", "c"] or something like that.

Potential Implementation Details

Calculating Changed Bits from State

In the RFC, I wrote this hand-waved function:

function calculateChangedBitsForPlainObject(currentValue, nextValue) {
    // magic logic to diff objects and generate a bitmask
}

Here's how I imagine an un-hand-waved implementation would work.

Let's say we've got a Redux store whose top-level state looks like {a, b, c, d}. The contents of those state slices is irrelevant - we can assume that the values are being updated immutably, as you're supposed to do with Redux.

A standard shallow equality check loops over the keys for two objects , and for each key, checks to see if the corresponding values in the two objects have changed. If any have changed, it returns false ("these objects are not equal"). I would want to implement a similar function that returns an array of all keys whose values have changed in the newer object (such as ["a", "c"]).

From there, I would run each key name through a hash function of some sort, which hashes the key string into a single bit. I briefly experimented with using a "minimal perfect hashing" library a while back as an early proof of concept.

So, every time the Redux store updated, the React-Redux <Provider> component, as the sole subscriber to the store, would determine which keys had their values changed and generate the changedBits bitmask accordingly. It wouldn't be a "deep" comparison of the state trees, but rather a quick and simple "top level" list of changes, at least as a default implementation.

Connected components would by default still subscribe to all updates, but we would provide a way for the end user to indicate that a connected component only cares about updates to certain state slices (hypothetical example: connect(mapState, null, null, {subscribeToStateSlices: ["a", "d"]})(MyComponent).

Calculating Component Usage of State Slices

This is the tricky part. In an ideal world, a connected component could wrap up the store state in an ES6 Proxy, call its mapState function with that wrapped state, and track which fields had been accessed. Unfortunately, there's tons of edge cases involved here. @theKashey has been doing a lot of research on this as part of the Remixx project he's working on. Quoting his comment:

So, let me explain what I've got today:

Proxyequal - as a "base layer". Wraps state and records all the operations on it. After being used it will produce a list and a trie with all values used, and how exactly they were used. Ie then you are doing state.a.b - you need only b, but could early-deside(memoize in this case) if a still the same.

This library is all you need to make a first step. But not the second step.

The problems are

  • selectors are not constant. In the different conditions, they may access different keys - state.items[props.key] - so their work should be reestimated every run.
  • there is no "every run" with memoization. On the second run memoized selector will do nothing.
  • this situation is easy to detect, proxyequal ships shouldBePure helper, which could test, that some function is doing the same every run.
  • but memoization and memoization cascades is something good to have.

So the plan is:

  • create a function, which could wrap any other memoization library, and "bubble" "access trie" to the top, in case of memoization
  • amend connect to throw in debug mode, if mapStateToProps is not absolutely pure
  • collect all "access tries" on the top level, and do the job.
  • but it is still a bit fragile.
  • probably call all mapStateToProps in dev mode (or in tests), and throw throw errors, if something got unpredicted updated, ie was not "reading" some keys before.

Another option:

  • add another option to connect, with information about "what it will do, which parts of the state it gonna access". In case of breach - console.log settings to be updated.

Ie be more declarative and let js help you write down these declarations.

All this stuff is hard to write, but easy to test. But this behavior doesn't sound like something production ready.

Use of calculateChangedBits

I had originally assumed that each Context.Provider could accepts a calculateChangedBits function as a prop. However, I was wrong - you actually pass one function as an argument to createContext, like React.createContext(defaultValue, calculateChangedBits). This is a problem, because we're likely to use a singleton-style ReactReduxContext.Provider/Consumer pair for all of React-Redux, and that would be created when the module is imported. It would be really helpful if we had the ability to pass it as a prop to each Context.Provider instead. That would let us be more dynamic with our change calculations, such as handling an arrayor an Immutable.js Map as the root state instead of a plain object.

At Dan Abramov's suggestion, I filed React RFC #60: Accept calculateChangedBits as a prop on Context.Providers to discuss making that change. Sebastian Markbage liked the writeup itself, but was hesitant about the actual usage of the bitmask functionality on the grounds that it might change in the future, and whether it was appropriate for our use case. The discussion hasn't advanced further since then.

Personally, I think the bitmask API is exactly what React-Redux needs, and that we're the kind of use case the API was created for.

One potential alternative might be a "double context" approach. In this approach, we might create a unique context provider/consumer pair whenever a React-Redux <Provider> is instantiated, use the singleton ReactReduxContext instance to pass that unique pair downwards to the connected components, and put the data into that unique context instance. That way, we could pass in a calculateChangedBits function to the createContext call . It seems hacky, but that was actually something Sebastian suggested as well.

Potential Improvements

@cellog did some quick changes to our existing stockticker benchmarks, which uses an array of entries. I think it was basically bitIndex = arrayIndex % 31, or something close to that - something simple and hardcoded as a proof of concept. He then tweaked his v6 branch (#1000 ) to calculate observedBits the same way, and compared it to both v5 and the prior results for that branch. I don't have the specific numbers on hand, but I remember he reported it was noticeably faster than the v5 implementation, whereas the prior results were a bit slower than v5. (Greg, feel free to comment with the actual numbers.)

So, conceptually it makes sense this could be an improvement, and our first test indicates it can indeed be an improvement.

Timeline

At this point, I don't think this would be part of a 6.0 release. Instead, we'd probably make sure that 6.0 works correctly and is hopefully equivalent in performance to v5, and release it. Then we could research the bitmasking stuff and put it out in a 6.x release, similar to how React laid the foundation with a rewrite in 16.0 and then actually released createContext in 16.3.

@theKashey

This comment has been minimized.

Copy link

theKashey commented Sep 12, 2018

32 bytes are not enough, I am not sure we could create a really good hash function to serve as a bloom filter. What about shouldContextUpdate?

I've actually have another idea for optimization, which works quite well in my case - react-redux-unbranch. Idea is simple - every connect is a Provider for all the nested connects. It form a real "tree", the same as reselect memoization cascade, and, if some keys of the store are not being used in the left branch, - it's easy to detect and ignore all the changes of the unused keys in the future.
Ie it does not track "the key usage", but work with "unusage".
As long after the each step deeper the tree you have less and less keys used in exactly "that" branch - sooner or later we will enter the state, when 32 bits are actually enough to represent changes.
What if maintain more than one connection to the redux store? Divide and Conqueror!

@markerikson

This comment has been minimized.

Copy link
Contributor

markerikson commented Sep 12, 2018

If we're only concerned with top-level state slices (at least as an initial approach), then 31 bits is way more than enough. I'd say the majority of Redux stores I've seen have <15 top-level state slices, and even if there's more than 31, then having more than one key map to the same bit index certainly isn't a problem.

@theKashey

This comment has been minimized.

Copy link

theKashey commented Sep 12, 2018

Just tracking a top level slices is way more easy task, especially in terms if amount of information one have to store(just first part of a key used), but would help a lot in 99% cases.
It also wipes out memoization problem I've mentioned before, as long from run to run selector could access different keys, but almost every time it will use the same slices. This assumption could lead to proposal that any connect could only start using some slice, but not end.
This makes used keys hoisting much easier - one have to report only a few short slice names. This makes observability easier and faster, as long to track top level slices one need to wrap with proxies only Store itself, not deeply nested objects.
The only problem here - it's easy to make this slice-level tracking works, I could give you a working example in a day, but it will be hard to move to "two levels deep" tracking in the future.

Are you considering having 2 react-redux? One as I described - simple, and capable to handle 99% applications, and the second, we are unable to build right now, but may be in the future....

@markerikson

This comment has been minimized.

Copy link
Contributor

markerikson commented Sep 12, 2018

No, I wouldn't want to try to maintain multiple variants.

As you've said, it's theoretically possible to get really fancy and track deeply nested state accesses, but limiting ourselves to top-level state slices would be a reasonable first step. It would probably be opt-in at first, and if it works well enough, we could consider making it the default behavior later.

FWIW, I recently ran a Twitter poll asking whether people use plain JS objects, Immutable.js, or something else as their Redux state tree: https://twitter.com/acemarke/status/1036729164538281984 . Discounting the 16% who answered "See Answers", 82% of respondents are using POJOs, and I'd bet that most of the people who are using Immutable.js still have a POJO as the root of their state tree. So, even if our solution only worked for a top-level state tree POJO, the vast majority of users could benefit from it.

@theKashey

This comment has been minimized.

Copy link

theKashey commented Sep 12, 2018

Sounds good for me. Look like there are 3 tasks as result:

  1. Find the changed slices on store update. Just a not shallow equal ones.
  2. Track the slice usage inside mapStateToProps. Using a simple one level proxy. Double check how memoization work with it, probably work well, as long something will has to access store.slice to perform shallow equal.
  3. Given names of used slices - produce a bit mask. Probably in keys enumeration order. Here one have to include "checksum" in case or reduces update and thus enumeration change.
  • More "deep" tracking could work as a separate layer
  • We don't need proxies for this stuff, but could use standard getters. This moment requires some performance testing.
@markerikson

This comment has been minimized.

Copy link
Contributor

markerikson commented Sep 12, 2018

Sounds reasonable. I think our two v6 branches (#995 and #1000 ) are sufficiently far along that we could try adding this functionality to them to see how well it works.

Perhaps there's some way we could set it up in "dummy/no-op mode" to begin with - calculate the changes, and somehow log/collect results on whether they would have been "correct" or not.

@theKashey

This comment has been minimized.

Copy link

theKashey commented Sep 12, 2018

Which one could I pick for a spike? (they both use Context, and everything else actually does not matter for this case)

@cellog

This comment has been minimized.

Copy link
Contributor

cellog commented Sep 12, 2018

Before changedbits, I was getting 14 fps, 5 was 16. After, I got 24 fps, 5 still at 16. It's a huge difference.

P. S. I don't think we need to change context API to accept a prop. We just accept context as a prop on our provider, and as a connect option in connect. This optimization is going to be a rare use case, most users won't need it

@cellog

This comment has been minimized.

Copy link
Contributor

cellog commented Sep 12, 2018

But you can see the difference in the react-redux-benchmarks repo for yourself, don't take my word for it

@markerikson

This comment has been minimized.

Copy link
Contributor

markerikson commented Sep 12, 2018

@theKashey : go with #1000 for now, since Greg's already done some fiddling with changedBits on that branch.

@cellog

This comment has been minimized.

Copy link
Contributor

cellog commented Sep 12, 2018

FYI https://github.com/cellog/react-redux/blob/observedBits/src/utils/observedBits.js

This version is not unit tested, and has some known bugs. Basically, it allows you to pass in a state shape, and it spits out an object that can return a context to use, as well as utilities to pass in a key to the state map and get a bitmap back.

@cellog

This comment has been minimized.

Copy link
Contributor

cellog commented Sep 12, 2018

P.S. just learned that in a component without sCU, observedBits is pretty useless. If sCU simply returns false, however, it limits re-renders as expected. So we will want to use PureComponent to limit unnecessary re-renders when props don't change, or something similar

@markerikson

This comment has been minimized.

Copy link
Contributor

markerikson commented Sep 13, 2018

@cellog : observedBits only comes into play when the Context.Provider is given a new value. If the component rendering the Context.Consumer re-renders, then yes, the render prop callback should execute too.

@cellog

This comment has been minimized.

Copy link
Contributor

cellog commented Sep 13, 2018

I think I was unclear. I constructed this test:

  test("context", () => {
    const mapper = bits.makeArrayMapper(40)
    const FancyContext = mapper.context()
    const updates = []

    class Li extends React.Component {
      shouldComponentUpdate() {
        return false
      }

      render() {
        const { i } = this.props
        return (
          <FancyContext.Consumer unstable_observedBits={mapper.getBits(i)}>
            {state => {
              updates.push(i)
              return <li>{state[i]}</li>
            }}
          </FancyContext.Consumer>
        )
      }
    }
    class Updates extends React.Component {
      constructor(props) {
        super(props)
        this.state = {
          info: Array(40)
            .fill(1)
            .map((_, i) => i + 1)
        }
        this.mapper = (thing, i) => <Li i={i} key={i} />
      }

      render() {
        return (
          <FancyContext.Provider value={this.state.info}>
            <ul>{this.state.info.map(this.mapper)}</ul>
            <button
              onClick={() =>
                this.setState(state => {
                  const info = [...state.info]
                  info[3] = "hji"
                  info[27] = "wow"
                  return { info }
                })
              }
            >
              set
            </button>
          </FancyContext.Provider>
        )
      }
    }

    const tester = rtl.render(<Updates />)
    expect(updates.length).toBe(40)
    rtl.fireEvent.click(tester.getByText("set"))
    expect(updates.length).toBe(43)
  })

but until I added that sCU that returns false, I was getting 80 updates. So it's not enough to use observedBits, we have to also prevent re-render the normal way too

@markerikson

This comment has been minimized.

Copy link
Contributor

markerikson commented Sep 13, 2018

Yeah, I think that's still basically what I was saying. observedBits does nothing to stop a normal re-render. It only comes into play when a context update is being triggered. If the component that is rendering the consumer re-renders, everything inside it re-renders.

@theKashey

This comment has been minimized.

Copy link

theKashey commented Sep 16, 2018

@markerikson - PR drafted.

@timdorr

This comment has been minimized.

Copy link
Member

timdorr commented Nov 7, 2018

Looked into this in #1021. It didn't seem to make a huge difference for now.

@timdorr timdorr closed this Nov 7, 2018

@markerikson markerikson reopened this Nov 7, 2018

@markerikson

This comment has been minimized.

Copy link
Contributor

markerikson commented Nov 7, 2018

I'd like to look at this further after 6.0 goes final, so I'll reopen it as a marker.

@markerikson

This comment has been minimized.

Copy link
Contributor

markerikson commented Nov 15, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment