Fix issue with multiple code branches in hooks linter #14661

Merged
merged 4 commits into from Jan 25, 2019

Contributor

@Yurickh Yurickh commented Jan 23, 2019

I changed the algorithm to a simple DFS so we can find the correct number of possible paths from the hook to the start of the function.

I'm not sure I should change the other (similar) functions as I couldn't think of a breaking example for them.

Fixes #14362

@facebook-github-bot

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need the corporate CLA signed.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@@ -396,30 +396,30 @@ exports[`ReactDebugFiberPerf supports Suspense and lazy 2`] = `
"
`;

exports[`ReactDebugFiberPerf supports portals 1`] = `
exports[`ReactDebugFiberPerf supports memo 1`] = `
Contributor Author

I'm not really sure why this changed, as I just ran the tests and didn't ask for snapshot updating at any time.

return 0;
}
function countPathsFromStart(segment, visited = new Set()) {
if (codePath.thrownSegments.includes(segment)) {
Contributor Author

I guess so, since the linter always runs in a Node environment.

Contributor Author

Also, IIRC, thrownSegments isn’t an array here, it’s just array-like.

Contributor

@Jessidhia Jessidhia Jan 24, 2019

Assuming this is an eslint@5 plugin, then yes as it's supported since node 6.

(Unless there are specific rules against it here?)

Contributor Author

Since it is a direct extract from the original code, this shouldn't really be an issue.

@hg-pyun hg-pyun Jan 24, 2019

Okay, it looks good.

@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

*/

function countPathsFromStart(segment) {
const {cache} = countPathsFromStart;
Contributor

@calebmer calebmer Jan 24, 2019

Hi 👋
I’m the original author of this ESLint rule popping in for a review.

The reason I used a cache here is that in complex components, time complexity can really start to spike up. Consider:

function MyComponent() {
  if (a) {} else {}
  if (b) {} else {}
  if (c) {} else {}
  useHook();
}

Here we have 2^3 = 8 paths from useHook() to the start of MyComponent, representing the following combinations of a, b, and c.

a b c
true true true
false true true
true false true
true true false
true false false
false true false
false false true
false false false

So we have a 2^n exponential relationship, fun. Now remember that every && and || (in the future perhaps also ??) introduces a condition since false && expensive() will not execute expensive().
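As a small illustration of why each logical operator adds a branch, the sketch below counts how often the right-hand side actually runs. The `expensive()` helper and the `calls` counter are made up for this example; they are not part of the lint rule.

```javascript
// Hypothetical helper: records every time it is actually evaluated.
let calls = 0;
function expensive() {
  calls += 1;
  return true;
}

// The right operand is skipped when the left operand already decides the result.
const skippedByAnd = false && expensive(); // expensive() is not called
const skippedByOr = true || expensive(); // expensive() is not called

// Here the left operand does not decide the result, so expensive() runs.
const evaluated = true && expensive();

console.log(calls); // prints 1
```

Each of these expressions therefore has two possible execution paths, just like an if/else.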

Let’s say we have a complex component that has 5 conditions placed in the component before 6 hooks all in the same segment. Without a cache we have to call countPathsFromStart() on 2^5 paths 6 times. With a cache, we only need to call countPathsFromStart() on 2 × 5 segments because we cache the value for every segment so we only need to visit each segment once.

In big-O notation where “n” is the number of conditions and “h” is the number of hooks, we have O(2 × n) time complexity with a cache and O(2^n × h) without a cache.
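To make the difference concrete, here is a minimal sketch that counts recursive calls with and without memoization. It does not use ESLint's real code path objects: `buildDiamondChain`, `countPaths`, and `measure` are hypothetical names, and the graph is an assumed simplification in which every if/else forks into two segments that join again.

```javascript
// Build a hypothetical control-flow graph for `n` sequential if/else
// statements: each statement forks into two branches that join again.
function buildDiamondChain(n) {
  let nextId = 0;
  const makeSegment = prevSegments => ({id: nextId++, prevSegments});
  let join = makeSegment([]); // start of the function
  for (let i = 0; i < n; i++) {
    const consequent = makeSegment([join]);
    const alternate = makeSegment([join]);
    join = makeSegment([consequent, alternate]);
  }
  return join; // the segment that would contain the hooks
}

// Count paths back to the start; pass `null` as cache to disable memoization.
function countPaths(segment, cache, stats) {
  stats.calls++;
  if (cache !== null && cache.has(segment.id)) {
    return cache.get(segment.id);
  }
  let paths = 0;
  if (segment.prevSegments.length === 0) {
    paths = 1;
  } else {
    for (const prevSegment of segment.prevSegments) {
      paths += countPaths(prevSegment, cache, stats);
    }
  }
  if (cache !== null) {
    cache.set(segment.id, paths);
  }
  return paths;
}

// Run the count once per hook and report the result plus the total calls.
function measure(conditions, hooks, useCache) {
  const hookSegment = buildDiamondChain(conditions);
  const cache = useCache ? new Map() : null;
  const stats = {calls: 0};
  let paths = 0;
  for (let h = 0; h < hooks; h++) {
    paths = countPaths(hookSegment, cache, stats);
  }
  return {paths, calls: stats.calls};
}

const cached = measure(5, 6, true);
const uncached = measure(5, 6, false);
console.log(cached, uncached);
```

Both runs report 32 paths for 5 conditions, but the call count of the uncached run grows with 2^n × h while the cached run grows roughly linearly in n.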

To see what I mean in practice try adding the following component to the test suite. On this branch, I became impatient after waiting about 10s for the test to finish. When I switched back to master the entire ESLint test suite finished in about 3s.

It’s up to the React team (cc @gaearon) to determine whether or not this performance regression is acceptable. 20 conditions and 10 hooks were fine for me on this branch, but 40 conditions and 10 hooks were not. If this performance regression is not acceptable then I recommend adding the below component to the test suite.

function MyComponent() {
  // 40 conditions
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}
  if (c) {} else {}

  // 10 hooks
  useHook();
  useHook();
  useHook();
  useHook();
  useHook();
  useHook();
  useHook();
  useHook();
  useHook();
  useHook();
}

Contributor Author

@Yurickh Yurickh Jan 24, 2019

Hi! Thanks for chiming in and explaining the original rationale behind this piece of code; that was really enlightening.

Unfortunately, I'm not sure there's a way to work around this performance regression. The cache as it stands proved faulty and can lead to false negatives, as the issue showed.
I could just remove the item from the cache once we finish visiting it, but the complexity of the overall algorithm would be the same.
I guess the complexity of finding the number of paths between two nodes on a graph can't really be reduced here.

I'll think about how we could improve it for the case of multiple hooks, so the complexity is not so high.

Contributor

@calebmer calebmer Jan 24, 2019

I added some comments below with a possible solution and the rationale behind that solution. TL;DR when countPathsFromStart() breaks a cycle it gives a temporary result of 0 to avoid looping forever.

@calebmer
Contributor

@Yurickh I found a fix for the problem that does not involve removing the cache. In countPathsFromStart() make the following change:

               paths += countPathsFromStart(prevSegment);
             }
           }
-          cache.set(segment.id, paths);
+          // If our segment is reachable then there should be at least one path
+          // to it from the start of our code path.
+          if (segment.reachable && paths === 0) {
+            cache.delete(segment.id);
+          } else {
+            cache.set(segment.id, paths);
+          }

           return paths;
         }

If you choose to use this fix, I encourage you to think about why this is the correct solution. In theory, “a reachable segment has at least one path from the start” is a correct invariant for countPathsFromStart() but not for countPathsToEnd(), so segment.reachable && paths === 0 signals a violation. When we detect a violation of this invariant, we “forget” the computation, which allows it to be re-computed later, when we’ll get the correct result. (This has to do with how cycles are treated.)

Perhaps there is a more principled fix in the if (paths === null) { code, since all we’re doing here is saying “we don’t like this result you gave us so try again later”.

If you choose to use this fix I also recommend adding the test case I included above to prevent future performance regressions.

@calebmer
Contributor

A more in-depth look at this problem. To debug, I added the line below to the top of the countPathsFromStart() function. It logs the segment’s id along with some indentation so we can see the nesting of our calls. (I got this idea from one of Sophie’s tweets, actually.)

console.log('  '.repeat(i) + segment.id);

That gives us:

s2_6
  s2_2
    s2_1
    s2_5
      s2_4
        s2_3
          s2_2
  s2_3

@LosYear’s diagram from #14362 is really helpful for understanding this output, so I’ll include it:

Adding some annotations to the log:

s2_6
  s2_2
    s2_1              <---- first segment
    s2_5
      s2_4
        s2_3          <---- first call to countPathsFromStart(s2_3)
          s2_2        <---- cycle broken
  s2_3                <---- second call to countPathsFromStart(s2_3)

Our first call to countPathsFromStart(s2_3) looks at the segment which comes before it, s2_2. However, we’ve already visited s2_2 so we’ve found a cycle! We break the cycle by returning 0 which means s2_3 gets a count of 0. However, 0 is not the correct value for s2_2! You’ll see that the first time we visit s2_2 it has two previous segments: s2_1, our initial segment, and s2_5. s2_1 is the initial segment so it gives us 1 path. s2_5 cycles back to s2_2 so it gives us 0 paths. Therefore the correct number of paths for s2_2 is 1, not 0. But we got 0 in our first call to countPathsFromStart(s2_3) because we were breaking the cycle.

So that’s why we delete the result when segment.reachable && paths === 0 is violated. We know that if we call countPathsFromStart(s2_3) again that this time s2_2 will have the correct value, 1, instead of 0.
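The walkthrough above can be reproduced with a small standalone sketch. Everything here is an assumption made for illustration: the `prevSegmentIds` graph is read off the call log rather than taken from ESLint, `makeCounter` and `forgetSuspiciousZero` are hypothetical names, every segment is treated as reachable, and a `null` cache entry stands in for "currently being visited".

```javascript
// Hypothetical graph reconstructed from the call log above: each segment id
// maps to the ids of its previous segments, in the order they were visited.
const prevSegmentIds = {
  s2_1: [], // initial segment
  s2_2: ['s2_1', 's2_5'],
  s2_3: ['s2_2'],
  s2_4: ['s2_3'],
  s2_5: ['s2_4'],
  s2_6: ['s2_2', 's2_3'], // segment containing the hook
};

function makeCounter({forgetSuspiciousZero}) {
  const cache = new Map();
  const trace = [];

  function countPathsFromStart(id, depth = 0) {
    trace.push('  '.repeat(depth) + id);
    let paths = cache.get(id);
    if (paths === null) {
      return 0; // already being visited: break the cycle with a temporary 0
    }
    if (paths !== undefined) {
      return paths; // cached result
    }
    cache.set(id, null); // mark as "in progress"
    const prevIds = prevSegmentIds[id];
    if (prevIds.length === 0) {
      paths = 1; // the initial segment is its own single path
    } else {
      paths = 0;
      for (const prevId of prevIds) {
        paths += countPathsFromStart(prevId, depth + 1);
      }
    }
    // Assumption: all segments in this toy graph are reachable, so a count
    // of 0 can only come from a broken cycle and should not be remembered.
    if (forgetSuspiciousZero && paths === 0) {
      cache.delete(id);
    } else {
      cache.set(id, paths);
    }
    return paths;
  }

  return {countPathsFromStart, trace};
}

const before = makeCounter({forgetSuspiciousZero: false});
const after = makeCounter({forgetSuspiciousZero: true});
const pathsBefore = before.countPathsFromStart('s2_6');
const pathsAfter = after.countPathsFromStart('s2_6');

console.log(before.trace.join('\n'));
console.log(pathsBefore, pathsAfter); // prints 1 2
```

With the zero remembered, the second visit to s2_3 reuses the stale 0 and the hook segment reports one path. With the zero forgotten, s2_3 is recomputed after s2_2 has its real value and the hook segment reports two paths.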

@Yurickh
Contributor Author

Yurickh commented Jan 24, 2019

Looks like this really solves the problem for good, without significant performance loss.
I'm feeling bad for taking the authorship of the PR now haha!

Thank you so much for the insight! This turned out to be a great solution after all.

@sompylasar
Contributor

I apologize in advance for the off-topic question; GitHub does not yet have a true "code review" feature for an arbitrary piece of code at an arbitrary time, only "diff review".

I was curious and looked around the code of the ESLint rule, and found this comment, which raised a question for me:

        // Gets the function name for our code path. If the function name is
        // `undefined` then we know either that we have an anonymous function
        // expression or our code path is not in a function. In both cases we
        // will want to error since neither are React function components or
        // hook functions.
        const codePathFunctionName = getFunctionName(codePathNode);

and then:

            } else if (codePathFunctionName) {
              // Custom message if we found an invalid function name.
              const message =
                `React Hook "${context.getSource(hook)}" is called in ` +
                `function "${context.getSource(codePathFunctionName)}" ` +
                'which is neither a React function component or a custom ' +
                'React Hook function.';
              context.report({node: hook, message});

I reviewed the function getFunctionName(node) { and it looks like it's only concerned with use cases of finding a function name of a hook, not a function component.

In particular, I may write a function component this way, using an anonymous function:

const MyComponentWithStyles = withStyles((props) => {
  // some useAnyHook() code here
});

Would it become a problem that the anonymous function inside a call argument is meant to be a React function component, but it's not annotated to be that? Is this a rule now to not use anonymous functions to define React function components?

@Yurickh
Contributor Author

Yurickh commented Jan 25, 2019

Hi, @sompylasar. In fact, you'll find an issue about your discoveries on #14404 (comment)

@sompylasar
Contributor

@Yurickh Thanks! It's exactly that. I searched for eslint hooks issues but missed this one (oh I realized I might have had is:pr in the filters).

@gaearon gaearon merged commit e19c9e1 into facebook:master Jan 25, 2019
@gaearon
Collaborator

gaearon commented Jan 25, 2019

Thank you!

7 participants