Handle circular dependency upon dependency removal #506

raejin commented Jan 3, 2020

Summary
This is a proposed fix for issue #500, which I submitted. Please refer to that issue for the problem statement and the scope of the bug.

Currently, Metro does not handle dependency removal correctly in the presence of certain circular dependencies. We rely on the Metro server to produce a correct dependency graph so that only the needed modules are bundled. The problem is most visible when a dependency is removed from the entry point: if that dependency is part of a cycle, the updated dependency graph still retains it and all of its dependencies.

I added or updated the following test cases to ensure the correctness of the algorithm:

  1. Remove B from E: removes a dependency that has a transitive cyclic dependency.

  2. Remove B from E: removes a cyclic dependency that is both an inverse dependency and a direct dependency.

  3. Remove B from E: removes a subgraph that has an internal cyclic dependency.

Feel free to propose more tests to ensure the correctness of the algorithm. I'm aware that this may introduce more expensive graph updates for certain scenarios, but I believe that ensuring the correctness is far more important for dependency graph updates.

Implementation

Previously, our implementation stopped removing a circular dependency whenever the module still had remaining inverseDependencies:

if (module.inverseDependencies.size) {
  return;
}

This can be illustrated by this example:
image

When removing B from E, B still has one inverse dependency left, A. Hence, the rest of the graph remains untouched. My proposed solution is to have the async function canSafelyRemoveFromParentModule recursively check all the inverse dependencies. In this example, we follow the inverse dependencies of A all the way up until there are no more inverse dependencies. We can safely remove this dependency if and only if its end inverse dependency (in this case, A has B as its end inverse dependency) is reached by only one path, and that path is the same as the parent path.
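
The idea can be sketched roughly as follows. This is not the code from this PR: it is a simplified variant that, instead of comparing paths, asks whether any entry point is still reachable by walking inverse dependencies upward. The graph shape (a Map from path to `{dependencies, inverseDependencies}`), `buildGraph`, `isReachableFromEntry`, and this `removeDependency` signature are all assumptions made for illustration.

```javascript
// Hypothetical graph shape: Map<path, {dependencies: Set, inverseDependencies: Set}>.
function buildGraph(edges) {
  const graph = new Map();
  const ensure = path => {
    if (!graph.has(path)) {
      graph.set(path, {dependencies: new Set(), inverseDependencies: new Set()});
    }
    return graph.get(path);
  };
  for (const [from, to] of edges) {
    ensure(from).dependencies.add(to);
    ensure(to).inverseDependencies.add(from);
  }
  return graph;
}

// Walks inverse dependencies upward from `path`.
// Returns true if some entry point can still reach `path`.
function isReachableFromEntry(graph, path, entryPoints, visited = new Set()) {
  if (entryPoints.has(path)) return true;
  if (visited.has(path)) return false; // already on this walk: we are in a cycle
  const mod = graph.get(path);
  if (!mod) return false; // module was already deleted
  visited.add(path);
  for (const parent of mod.inverseDependencies) {
    if (isReachableFromEntry(graph, parent, entryPoints, visited)) return true;
  }
  return false;
}

// Removes the edge parentPath -> path, then deletes `path` (and, recursively,
// its dependencies) only when no entry point can reach it any more.
function removeDependency(graph, parentPath, path, entryPoints, deleted = new Set()) {
  const mod = graph.get(path);
  if (!mod) return deleted; // already removed earlier in this pass
  const parent = graph.get(parentPath);
  if (parent) parent.dependencies.delete(path);
  mod.inverseDependencies.delete(parentPath);

  // A non-empty inverseDependencies set alone does not prove the module is
  // still needed: the remaining parents may only be reachable via a cycle.
  if (isReachableFromEntry(graph, path, entryPoints)) return deleted;

  graph.delete(path);
  deleted.add(path);
  for (const dependency of mod.dependencies) {
    removeDependency(graph, path, dependency, entryPoints, deleted);
  }
  return deleted;
}

// E depends on B, and A and B depend on each other.
const exampleGraph = buildGraph([['E', 'B'], ['B', 'A'], ['A', 'B']]);
const removed = removeDependency(exampleGraph, 'E', 'B', new Set(['E']));
console.log([...removed]); // logs [ 'B', 'A' ]
```

With the early return on `inverseDependencies.size`, B would stay in the graph because A still points at it. In this sketch both B and A are removed, since neither can be reached from E any more.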

Test plan

  • Updated 1 existing test and added 2 unit tests to ensure that the dependency removal logic handles circular dependencies correctly.
yarn run jest packages/metro/src/DeltaBundler/__tests__/traverseDependencies-test.js

codecov-io commented Jan 3, 2020

Codecov Report

Merging #506 into master will increase coverage by 0.02%.
The diff coverage is 90.9%.

@@            Coverage Diff             @@
##           master     #506      +/-   ##
==========================================
+ Coverage   84.07%   84.09%   +0.02%     
==========================================
  Files         175      175              
  Lines        5864     5891      +27     
  Branches      973      981       +8     
==========================================
+ Hits         4930     4954      +24     
- Misses        822      825       +3     
  Partials      112      112
Impacted Files Coverage Δ
...ges/metro/src/DeltaBundler/traverseDependencies.js 94.81% <90.9%> (-1.49%) ⬇️

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 984aab8...9193ee1. Read the comment docs.

inverseDependencies is not empty. Rather, we recursively check the
inverseDependencies to see if it eventually only points to the removed
module. If this is the case, then we need to proceed removing dependency
instead of returning early.
@raejin raejin force-pushed the raejin:fix-dependency-removal-bug-when-circular-dependency-happens branch from 2e2b4fb to ae87904 Jan 3, 2020

noahsug left a comment

Looks great! How much does this impact performance?

'',
new Set(),
Comment on lines 368 to 369

noahsug Jan 4, 2020

Maybe make these args optional with default values
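
A minimal sketch of that suggestion, using a hypothetical helper rather than the function in this PR. `collectInverseDependencies` and the simplified graph shape (a Map from path to a Set of parent paths) are assumptions for illustration. JavaScript evaluates default parameter expressions on every call, so `visited = new Set()` yields a fresh set each time the caller omits it.

```javascript
// Hypothetical helper: collects every module reachable from `startPath`
// by following inverse dependencies. `graph` maps a path to the Set of
// paths that depend on it.
function collectInverseDependencies(graph, startPath, visited = new Set()) {
  for (const parent of graph.get(startPath) || []) {
    if (!visited.has(parent)) {
      visited.add(parent);
      // The recursive call passes the shared set explicitly.
      collectInverseDependencies(graph, parent, visited);
    }
  }
  return visited;
}

const inverse = new Map([
  ['E', new Set()],
  ['B', new Set(['E', 'A'])],
  ['A', new Set(['B'])],
]);

// Callers no longer need to pass the bookkeeping argument.
console.log([...collectInverseDependencies(inverse, 'B')]);
// logs [ 'E', 'A', 'B' ]
```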

* Given `inverseDependencies`, tracing back inverse dependencies to
* see if it only leads back to `parentModule`.
*/
async function canSafelyRemoveFromParentModule<T>(

noahsug Jan 4, 2020

I think this function is synchronous? In which case we should remove async

raejin Jan 6, 2020

Ahh, nice catch :)

// there isn't circular dependency. Thus, we check if it can be safely remove
// by tracing back the inverseDependencies.
if (
module.inverseDependencies.size &&

noahsug Jan 4, 2020

Can we do the module.inverseDependencies.size check in canSafelyRemoveFromParentModule?

raejin Jan 6, 2020

I think keeping the check here makes it clearer what canSafelyRemoveFromParentModule is meant to do: a non-zero module.inverseDependencies.size signals that we need an additional check of all the inverse dependencies.

cpojer commented Jan 6, 2020

Thank you so much for this fix. I believe this looks good but do you mind answering @noahsug's questions? I am also curious if there is any measurable performance difference but overall I think it is probably fine – all this data is available in the graph already and no expensive I/O needs to be done for this. If it's less than ~100ms additional time spent for a graph of 10k modules with one change I think we can live with this.

raejin commented Jan 16, 2020

@cpojer

> Thank you so much for this fix. I believe this looks good but do you mind answering @noahsug's questions? I am also curious if there is any measurable performance difference but overall I think it is probably fine – all this data is available in the graph already and no expensive I/O needs to be done for this. If it's less than ~100ms additional time spent for a graph of 10k modules with one change I think we can live with this.

After running the initial solution against our entrypoint, which amounts to 8730 modules, I found an obvious performance problem (~10s to remove the entrypoint).

After more investigation, I implemented a memoized version that avoids unnecessary DFS. With the updated approach, removing an entrypoint that has 8730 modules takes 1~2 seconds. I also added more tests for various edge cases that I came up with while debugging with our entrypoint.
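
To illustrate the kind of memoization meant here, below is a hedged sketch, not the code in this PR. It caches the answer to "can an entry point still reach this module?" so that repeated queries skip the DFS. `createReachabilityChecker`, the `expansions` counter, and the simplified graph shape (a Map from path to a Set of parent paths) are assumptions for illustration. The cache in this sketch is only valid while the graph is unchanged; a real removal pass mutates the graph and has to decide when cached answers can be reused.

```javascript
// Hypothetical memoized reachability check over inverse dependencies.
// `inverseGraph` maps a path to the Set of paths that depend on it.
function createReachabilityChecker(inverseGraph, entryPoints) {
  const memo = new Map(); // path -> boolean, filled in as queries complete
  let expansions = 0; // how many modules the DFS actually expanded

  function visit(path, visited) {
    if (entryPoints.has(path)) return true;
    if (memo.has(path)) return memo.get(path); // short circuit: no DFS needed
    if (visited.has(path)) return false; // already on this walk (cycle)
    visited.add(path);
    expansions++;
    for (const parent of inverseGraph.get(path) || []) {
      if (visit(parent, visited)) {
        memo.set(path, true); // everything on a successful path is reachable
        return true;
      }
    }
    return false;
  }

  return {
    isReachable(path) {
      const visited = new Set();
      const reachable = visit(path, visited);
      if (!reachable) {
        // The walk explored every ancestor without meeting an entry point,
        // so none of the visited modules is reachable either.
        for (const seen of visited) memo.set(seen, false);
      }
      return reachable;
    },
    getExpansions() {
      return expansions;
    },
  };
}

// E -> A -> B -> C is reachable from the entry point E; X <-> Y is an orphaned cycle.
const checker = createReachabilityChecker(
  new Map([
    ['E', new Set()],
    ['A', new Set(['E'])],
    ['B', new Set(['A'])],
    ['C', new Set(['B'])],
    ['X', new Set(['Y'])],
    ['Y', new Set(['X'])],
  ]),
  new Set(['E']),
);
console.log(checker.isReachable('C'), checker.getExpansions()); // logs true 3
console.log(checker.isReachable('B'), checker.getExpansions()); // logs true 3
console.log(checker.isReachable('X'), checker.getExpansions()); // logs false 5
console.log(checker.isReachable('Y'), checker.getExpansions()); // logs false 5
```

The second and fourth queries are answered from the cache, which is why the expansion count does not grow.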

@raejin raejin force-pushed the raejin:fix-dependency-removal-bug-when-circular-dependency-happens branch from 76f8c9c to fb93ce0 Jan 16, 2020
memoized solution to short circuit any situation when a module does not
need further DFS.
@raejin raejin force-pushed the raejin:fix-dependency-removal-bug-when-circular-dependency-happens branch from fb93ce0 to d194e41 Jan 16, 2020

noahsug left a comment

Nice work!! My comments are mostly style stuff

for (const dependency of module.dependencies.values()) {
  removeDependency(module, dependency.absolutePath, graph, delta);
}
await Promise.all(

noahsug Jan 16, 2020

I don't think there's anything asynchronous going on here, so we can remove the await Promise.all(. See below for comment.

removeDependency(module, dependency.absolutePath, graph, delta);
}
}
await Promise.all(

noahsug Jan 16, 2020

I don't think there's anything asynchronous going on here, so we can remove the await Promise.all(. See below for comment.

raejin Jan 17, 2020

will do!

return canSafelyRemove;
}

async function removeDependency<T>(

noahsug Jan 16, 2020

I don't think this is doing anything asynchronous, so we can remove async.

Since JavaScript is single-threaded, the recursive await Promise.all( below is actually running synchronously (unless I'm missing something - we're not using jest-worker or anything truly async, right?)
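
A small sketch of this point, using a hypothetical `visit` function rather than anything from the PR. When an async function contains no real asynchronous work, its body runs to completion as soon as it is called, so wrapping the calls in `Promise.all` does not run them in parallel.

```javascript
// Hypothetical demo: async functions that never wait on I/O or timers.
function runDemo() {
  const log = [];

  // Declared async, but the body contains no asynchronous work.
  async function visit(name) {
    log.push(`visit ${name}`);
  }

  // Each call runs its body to completion immediately, in order.
  const pending = Promise.all(['A', 'B', 'C'].map(name => visit(name)));

  // By the time we get here, all three bodies have already executed.
  log.push('caller continues');
  return {log, pending};
}

const demo = runDemo();
console.log(demo.log);
// logs [ 'visit A', 'visit B', 'visit C', 'caller continues' ]
```

The only effect `await` would have on code like this is that whatever follows it resumes in a later microtask; the traversal itself is not parallelized.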

raejin Jan 17, 2020

ahhh noo I see what you meant. I think I did remove all the necessary await. updating to address this!

const result = getAllTopLevelInverseDependencies(
inverseDependencies,
graph,
'',

noahsug Jan 16, 2020

nit: a comment here like '', // current module name could be helpful since there isn't a named variable to explain what it does

* this can happen when trying to see if we can safely remove from
* a module that was deleted. This is why we filtered them out with `delta.deleted`
* 2. We have one top module and it is parentModule
*

noahsug Jan 16, 2020

nit: remove empty line, and comment length seems inconsistent. I'm not sure what the style guidelines are on that

delta: Delta,
): boolean {
const visited = new Set();
const result = getAllTopLevelInverseDependencies(

noahsug Jan 16, 2020

maybe rename result to inverseDependencies? Since result isn't actually the result the function returns

return true;
}

const filterNotDeletedResult = Array.from(result).filter(

noahsug Jan 16, 2020

could rename to something like undeletedInverseDependencies

@raejin raejin force-pushed the raejin:fix-dependency-removal-bug-when-circular-dependency-happens branch from fa6ce94 to b836f22 Jan 17, 2020
cpojer added 3 commits Jan 17, 2020
cpojer commented Jan 17, 2020

@raejin

> After more investigation, I implemented a memoized version that avoids unnecessary DFS. With the updated approach, removing an entrypoint that has 8730 modules takes 1~2 seconds. I also added more tests for various edge cases that I came up with while debugging with our entrypoint.

Just wanted to clarify this is not the usual case for every change, only when removing a module that has 8k+ dependencies of its own, right? Removing a module here and there won't have a big performance impact, is that correct?
