[MachineOutliner] Efficient Implementation of MachineOutliner::findCandidates() #88988

Closed


xuanzhang816
Contributor

We reduce the complexity of the main loop of MachineOutliner::findCandidates() from O(n^2) to O(n log n).

We sort RS.StartIndices in SuffixTree and replace the find_if call with a simple check.

For each SuffixTree::RepeatedSubstring RS, finding a set of candidates that do not overlap with each other currently takes O(n^2) time, where n is the number of occurrences of the repeated substring (i.e., the size of RS.StartIndices). This is because every occurrence is checked against the already accepted candidates with find_if, which is O(n) per call. The quadratic runtime becomes a problem as n grows. For clang, with [3], the maximum n goes from 12k to 100k, and the time to complete the main loop in MachineOutliner::findCandidates() goes from 17 seconds to 120 seconds.
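To illustrate where the quadratic cost comes from, here is a minimal standalone sketch of the pattern described above. It is not the actual MachineOutliner code: the function name `selectNonOverlappingQuadratic`, the use of plain `unsigned` start indices, and `std::find_if` in place of LLVM's own helper are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical baseline (not LLVM code): keep occurrences of a repeated
// substring that do not overlap any occurrence kept so far.
// Assumes length >= 1. Each occurrence covers [start, start + length - 1].
std::vector<unsigned>
selectNonOverlappingQuadratic(const std::vector<unsigned> &startIndices,
                              unsigned length) {
  std::vector<unsigned> kept;
  for (unsigned start : startIndices) {
    unsigned end = start + length - 1;
    // Linear scan over everything kept so far: O(n) per occurrence,
    // hence O(n^2) over all occurrences.
    auto overlap =
        std::find_if(kept.begin(), kept.end(), [&](unsigned keptStart) {
          unsigned keptEnd = keptStart + length - 1;
          return start <= keptEnd && keptStart <= end;
        });
    if (overlap == kept.end())
      kept.push_back(start);
  }
  return kept;
}
```

Since the input order is arbitrary, any previously kept occurrence may overlap the current one, which is why the full scan is needed here.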

To improve the runtime, we implement a more efficient algorithm with complexity O(n log n). It relies on the fact that once RS.StartIndices is sorted, the find_if search can be replaced with an O(1) check: all occurrences have the same length, so a new occurrence can only overlap the most recently accepted candidate. The O(n log n) complexity comes from the sorting. For clang, with [1], the time to complete the loop is reduced to only 28 seconds.
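The sorted approach can be sketched as follows. This is a simplified standalone illustration under stated assumptions, not the code in this PR: `selectNonOverlapping` is a hypothetical name, and candidates are represented only by their start index rather than by outliner candidate objects.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not LLVM code): sort the start indices, then keep
// each occurrence that begins at or after the end of the last kept one.
// Each occurrence covers [start, start + length - 1].
std::vector<unsigned> selectNonOverlapping(std::vector<unsigned> startIndices,
                                           unsigned length) {
  assert(length > 0 && "a repeated substring has at least one element");
  // O(n log n): the only super-linear step.
  std::sort(startIndices.begin(), startIndices.end());

  std::vector<unsigned> kept;
  for (unsigned start : startIndices) {
    // Starts are visited in increasing order and all occurrences share one
    // length, so only the most recently kept occurrence can overlap: O(1).
    if (kept.empty() || kept.back() + length <= start)
      kept.push_back(start);
  }
  return kept;
}
```

For example, with start indices {10, 0, 2, 4, 5} and length 3, the sorted order is 0, 2, 4, 5, 10 and the occurrences kept are those starting at 0, 4, and 10.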


Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.
