New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Immutable data + bad performance #1262

Closed
elado opened this Issue Jan 22, 2016 · 12 comments

Comments

4 participants
@elado

elado commented Jan 22, 2016

At real-world's entities reducer implementation, https://github.com/rackt/redux/blob/master/examples%2Freal-world%2Freducers%2Findex.js#L10

This approach seems to be super slow with bigger data sets (say, 1000 entities).

lodash.merge has its own performance problem, as 1000 entities take 2.5s, but spread copy {...state, ...} is still slow (~120ms)

Here is a comparisons of merge/spread/partial copy/mutable: http://jsbin.com/miroto/3/edit?js,console

Obviously, Redux lives on immutable data, so it needs to stay immutable. How do you deal with this performance hit when dealing with many entities?

@babsonmatt

This comment has been minimized.

Show comment
Hide comment
@babsonmatt

babsonmatt Jan 22, 2016

@elado, have you tried ImmutableJS by any chance? I created an example method that fits your jsbin:

var users = _.times(1000).map(i => ({ type: 'user', id: i, name: `user ${i}` }));
var immutableUsers = Immutable.fromJS(users);

function immutableMerge() {
  return immutableUsers.reduce((state, u, key, iter) => {
     return state.mergeDeep({ [u.get('type')]: { [u.get('id')]: u } });
  }, Immutable.fromJS({ user: {} }))
}
{ lodashMerge: { state: { user: [Object] }, duration: 2281 },
  immutableMerge: { state: { user: [Object] }, duration: 62 } }

Edit: http://jsbin.com/cuvolasaja/edit?js,console

babsonmatt commented Jan 22, 2016

@elado, have you tried ImmutableJS by any chance? I created an example method that fits your jsbin:

var users = _.times(1000).map(i => ({ type: 'user', id: i, name: `user ${i}` }));
var immutableUsers = Immutable.fromJS(users);

function immutableMerge() {
  return immutableUsers.reduce((state, u, key, iter) => {
     return state.mergeDeep({ [u.get('type')]: { [u.get('id')]: u } });
  }, Immutable.fromJS({ user: {} }))
}
{ lodashMerge: { state: { user: [Object] }, duration: 2281 },
  immutableMerge: { state: { user: [Object] }, duration: 62 } }

Edit: http://jsbin.com/cuvolasaja/edit?js,console

@pke

This comment has been minimized.

Show comment
Hide comment
@pke

pke Jan 22, 2016

@babsonmatt hmm this is the output I get:

[object Object] {
  copyRoot: [object Object] {
    duration: 16,
    state: [object Object] { ... }
  },
  copyRootAndList: [object Object] {
    duration: 300,
    state: [object Object] { ... }
  },
  immutableMerge: [object Object] {
    duration: 262,
    state: [object Object] { ... }
  },
  lodashMerge: [object Object] {
    duration: 7466,
    state: [object Object] { ... }
  },
  mutable: [object Object] {
    duration: 0,
    state: [object Object] { ... }
  },
  spreadCopy: [object Object] {
    duration: 355,
    state: [object Object] { ... }
  }
}
[object Object] {
  copyRoot: [object Object] {
    duration: 5,
    state: [object Object] { ... }
  },
  copyRootAndList: [object Object] {
    duration: 544,
    state: [object Object] { ... }
  },
  immutableMerge: [object Object] {
    duration: 376,
    state: [object Object] { ... }
  },
  lodashMerge: [object Object] {
    duration: 9243,
    state: [object Object] { ... }
  },
  mutable: [object Object] {
    duration: 1,
    state: [object Object] { ... }
  },
  spreadCopy: [object Object] {
    duration: 606,
    state: [object Object] { ... }
  }
}

so mutable is the fastest?

pke commented Jan 22, 2016

@babsonmatt hmm this is the output I get:

[object Object] {
  copyRoot: [object Object] {
    duration: 16,
    state: [object Object] { ... }
  },
  copyRootAndList: [object Object] {
    duration: 300,
    state: [object Object] { ... }
  },
  immutableMerge: [object Object] {
    duration: 262,
    state: [object Object] { ... }
  },
  lodashMerge: [object Object] {
    duration: 7466,
    state: [object Object] { ... }
  },
  mutable: [object Object] {
    duration: 0,
    state: [object Object] { ... }
  },
  spreadCopy: [object Object] {
    duration: 355,
    state: [object Object] { ... }
  }
}
[object Object] {
  copyRoot: [object Object] {
    duration: 5,
    state: [object Object] { ... }
  },
  copyRootAndList: [object Object] {
    duration: 544,
    state: [object Object] { ... }
  },
  immutableMerge: [object Object] {
    duration: 376,
    state: [object Object] { ... }
  },
  lodashMerge: [object Object] {
    duration: 9243,
    state: [object Object] { ... }
  },
  mutable: [object Object] {
    duration: 1,
    state: [object Object] { ... }
  },
  spreadCopy: [object Object] {
    duration: 606,
    state: [object Object] { ... }
  }
}

so mutable is the fastest?

@gaearon gaearon added the discussion label Jan 22, 2016

@gaearon

This comment has been minimized.

Show comment
Hide comment
@gaearon

gaearon Jan 22, 2016

Collaborator

I'm a bit confused as to what the test is doing.
It's copying all the entities but only a few normally change when merging the response in.

Why does it copy every single entity?
And why are entities in an array rather than being indexed by IDs in an object?

Collaborator

gaearon commented Jan 22, 2016

I'm a bit confused as to what the test is doing.
It's copying all the entities but only a few normally change when merging the response in.

Why does it copy every single entity?
And why are entities in an array rather than being indexed by IDs in an object?

@elado

This comment has been minimized.

Show comment
Hide comment
@elado

elado Jan 22, 2016

@babsonmatt Thanks! I'll explore this today

@pke yes, obviously mutable is the fastest but it won't work with how redux expects the data, which is immutable

@gaearon

  • users is a mock of data that came from the server so it's an array
  • The reduce functions (act like reducers) inject all incoming users into the initial state { user: {} }
  • Eventually users are indexed by id { user: { '1': {id: 1, name: 'u1'}, '2': {id: 2, name: 'u2'}, ... } }
  • On each incoming user, the entire list (state.user) is copied, otherwise, user property will stay the same.
  • edit - untouched entities remain the same, they are just copied

Maybe it's ok to assume that the app doesn't care about state.user mutability? Then I can just inject new users into the same state.user, and maybe manage a list of IDs that would be immutable. Although I'm sure it will cause confusion.

edit 2 updated version of the jsbin: http://jsbin.com/mosiri/2/edit?js,console

elado commented Jan 22, 2016

@babsonmatt Thanks! I'll explore this today

@pke yes, obviously mutable is the fastest but it won't work with how redux expects the data, which is immutable

@gaearon

  • users is a mock of data that came from the server so it's an array
  • The reduce functions (act like reducers) inject all incoming users into the initial state { user: {} }
  • Eventually users are indexed by id { user: { '1': {id: 1, name: 'u1'}, '2': {id: 2, name: 'u2'}, ... } }
  • On each incoming user, the entire list (state.user) is copied, otherwise, user property will stay the same.
  • edit - untouched entities remain the same, they are just copied

Maybe it's ok to assume that the app doesn't care about state.user mutability? Then I can just inject new users into the same state.user, and maybe manage a list of IDs that would be immutable. Although I'm sure it will cause confusion.

edit 2 updated version of the jsbin: http://jsbin.com/mosiri/2/edit?js,console

@gaearon

This comment has been minimized.

Show comment
Hide comment
@gaearon

gaearon Jan 22, 2016

Collaborator

So the server response is like this:

[{
  type: 'user',
  id: 0,
  name: 'user 0'
}, {
  type: 'user',
  id: 1,
  name: 'user 1'
}]

and the state structure you want to get is like this:

{
  user: {
    0: {
      type: 'user',
      id: 0,
      name: 'user 0'
    },
    1: {
      type: 'user',
      id: 1,
      name: 'user 1'
    }
  }
}

Is this correct?

Collaborator

gaearon commented Jan 22, 2016

So the server response is like this:

[{
  type: 'user',
  id: 0,
  name: 'user 0'
}, {
  type: 'user',
  id: 1,
  name: 'user 1'
}]

and the state structure you want to get is like this:

{
  user: {
    0: {
      type: 'user',
      id: 0,
      name: 'user 0'
    },
    1: {
      type: 'user',
      id: 1,
      name: 'user 1'
    }
  }
}

Is this correct?

@elado

This comment has been minimized.

Show comment
Hide comment
@elado

elado Jan 22, 2016

@gaearon It could be an array like this or a single entity that I run an action to merge inside entitiesReducer state. The intention is to have normalized data, entities indexed by type and then by ID, exactly like real-world's eventual state.

elado commented Jan 22, 2016

@gaearon It could be an array like this or a single entity that I run an action to merge inside entitiesReducer state. The intention is to have normalized data, entities indexed by type and then by ID, exactly like real-world's eventual state.

@elado

This comment has been minimized.

Show comment
Hide comment
@elado

elado Jan 22, 2016

As for _.merge (what real-world uses) I filed an issue lodash/lodash#1865 but it's not something that's gonna be fixed.
So probably real-world shouldn't use _.merge.

elado commented Jan 22, 2016

As for _.merge (what real-world uses) I filed an issue lodash/lodash#1865 but it's not something that's gonna be fixed.
So probably real-world shouldn't use _.merge.

@gaearon

This comment has been minimized.

Show comment
Hide comment
@gaearon

gaearon Jan 22, 2016

Collaborator

_.merge was never supposed to be used in a loop. I think you're not using it in the correct way—you're supposed to first transform your input to be merge-able as is, and then call merge once rather than do it a thousand times.

Collaborator

gaearon commented Jan 22, 2016

_.merge was never supposed to be used in a loop. I think you're not using it in the correct way—you're supposed to first transform your input to be merge-able as is, and then call merge once rather than do it a thousand times.

@elado

This comment has been minimized.

Show comment
Hide comment
@elado

elado Jan 22, 2016

My data comes in a stream of entities that come in different times, each entity or a group of entities needs to be merged to the state.
The more data exists in the state, the longer it takes to merge it in.

To prevent UI re-render on each action, I debounce it. Explained here: reduxjs/react-redux#263

I'll probably end up using Immutable.js as it seems the fastest (interesting to see how it's solved inside).

elado commented Jan 22, 2016

My data comes in a stream of entities that come in different times, each entity or a group of entities needs to be merged to the state.
The more data exists in the state, the longer it takes to merge it in.

To prevent UI re-render on each action, I debounce it. Explained here: reduxjs/react-redux#263

I'll probably end up using Immutable.js as it seems the fastest (interesting to see how it's solved inside).

@gaearon

This comment has been minimized.

Show comment
Hide comment
@gaearon

gaearon Jan 22, 2016

Collaborator

This version is faster than the mutation code you wrote and is pure:

function somehow() {
  // These are supposed to be derived from state and action arguments
  let prevState = { user: {} };
  let fetchedEntities = users;

  // At first, shallowly copy the old state
  let newState = { ...prevState };
  let hasChanged = {};

  fetchedEntities.forEach(e => {
    let { type, id } = e;
    let newEntities = newState[type];

    // Unknown entity type: create an empty object
    if (!newEntities) {
      newEntities = newState[type] = {};
    }

    // If fetched entity is equal, avoid changing its reference
    if (_.isEqual(e, newEntities[id])) {
      return;
    }

    // Produce a new object (once!) if we're handling an entity of that type.
    // It's only created here so the function stays pure despite the internal mutation.
    if (!hasChanged[type]) {
      newEntities = newState[type] = { ...newEntities };
      hasChanged[type] = true;
    }

    // Reassign the entity (you can also merge with old version if you'd like)
    newEntities[id] = e;
  });

  return newState;
}

It also avoids changing the references for old dictionaries and individual items when we know they haven't changed.

Collaborator

gaearon commented Jan 22, 2016

This version is faster than the mutation code you wrote and is pure:

function somehow() {
  // These are supposed to be derived from state and action arguments
  let prevState = { user: {} };
  let fetchedEntities = users;

  // At first, shallowly copy the old state
  let newState = { ...prevState };
  let hasChanged = {};

  fetchedEntities.forEach(e => {
    let { type, id } = e;
    let newEntities = newState[type];

    // Unknown entity type: create an empty object
    if (!newEntities) {
      newEntities = newState[type] = {};
    }

    // If fetched entity is equal, avoid changing its reference
    if (_.isEqual(e, newEntities[id])) {
      return;
    }

    // Produce a new object (once!) if we're handling an entity of that type.
    // It's only created here so the function stays pure despite the internal mutation.
    if (!hasChanged[type]) {
      newEntities = newState[type] = { ...newEntities };
      hasChanged[type] = true;
    }

    // Reassign the entity (you can also merge with old version if you'd like)
    newEntities[id] = e;
  });

  return newState;
}

It also avoids changing the references for old dictionaries and individual items when we know they haven't changed.

@gaearon

This comment has been minimized.

Show comment
Hide comment
@gaearon

gaearon Jan 22, 2016

Collaborator

To clarify this.
If you create something inside your reducer it's fine to mutate it.

For example this is fine:

function reducer(state, action) {
  let arr = [state.something];
  arr.push(action.something);
  return arr;
}

I am mutating an array I just created so this mutation is completely unobservable from outside and it doesn't make the function any less pure—I can call it many times and get the same results.

However this is a bad mutation:

function reducer(state, action) {
  state.arr.push(action.something);
  return state;
}

In this case the mutation is observable and it will change the inputs, which is what we don't allow.

If a tree falls in the forest and nobody hears it, did it really fall?
Our answer is that mutating the stuff you created inside reducers is fine—mutating inputs is not.
Always use local mutation for better performance, just make sure you understand the difference.

Collaborator

gaearon commented Jan 22, 2016

To clarify this.
If you create something inside your reducer it's fine to mutate it.

For example this is fine:

function reducer(state, action) {
  let arr = [state.something];
  arr.push(action.something);
  return arr;
}

I am mutating an array I just created so this mutation is completely unobservable from outside and it doesn't make the function any less pure—I can call it many times and get the same results.

However this is a bad mutation:

function reducer(state, action) {
  state.arr.push(action.something);
  return state;
}

In this case the mutation is observable and it will change the inputs, which is what we don't allow.

If a tree falls in the forest and nobody hears it, did it really fall?
Our answer is that mutating the stuff you created inside reducers is fine—mutating inputs is not.
Always use local mutation for better performance, just make sure you understand the difference.

@gaearon gaearon closed this Jan 22, 2016

@gaearon

This comment has been minimized.

Show comment
Hide comment
@gaearon

gaearon Jan 22, 2016

Collaborator

Here's a slightly slower but simpler version relying on _.merge:

function lodashMerge2() {
  let prevState = { user: {} };
  let fetchedEntities = users;

  let stateToMerge = {};
  fetchedEntities.forEach(e => {
    let { type, id } = e;    
    stateToMerge[type] = stateToMerge[type] || {};
    stateToMerge[type][id] = e;
  });
  return _.merge({}, prevState, stateToMerge);
}

It's still 18 times faster than the ImmutableJS code in your example.
(Which can be optimized in a similar way as well!)

Scratch that, I don't think it's a good version because merge doesn't attempt to preserve the object identity if the merged objects are equal. This is why I'd still recommend hand-writing this code or writing your own merge() that attempts to preserve references when possible, as I did above.

Collaborator

gaearon commented Jan 22, 2016

Here's a slightly slower but simpler version relying on _.merge:

function lodashMerge2() {
  let prevState = { user: {} };
  let fetchedEntities = users;

  let stateToMerge = {};
  fetchedEntities.forEach(e => {
    let { type, id } = e;    
    stateToMerge[type] = stateToMerge[type] || {};
    stateToMerge[type][id] = e;
  });
  return _.merge({}, prevState, stateToMerge);
}

It's still 18 times faster than the ImmutableJS code in your example.
(Which can be optimized in a similar way as well!)

Scratch that, I don't think it's a good version because merge doesn't attempt to preserve the object identity if the merged objects are equal. This is why I'd still recommend hand-writing this code or writing your own merge() that attempts to preserve references when possible, as I did above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment