Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add config option to use ripgrep for scanning files #1086

Merged
merged 3 commits into from May 28, 2019
Merged

Conversation

@rafeca
Copy link
Contributor

@rafeca rafeca commented May 24, 2019

Summary

This PR adds a config option to enable ripgrep powered find and replace on this package. The integration with ripgrep has been done in Atom core on atom/atom#19348 and this PR just opts in the find and replace package to use it.

Screenshot 2019-05-24 at 12 14 38

Benefits

Using ripgrep speeds up drastically the time to find files on any kind of repository (we're taking about up to 22X faster times):

Type Num Files Time (standard) Time (ripgrep) Improvements
Small 2K 940ms 62ms 15X faster 馃帀
Medium 30K 7.7s 620ms 12X faster 馃帀
Large (returning 5 results) 270K 129s 5.9s 22X faster 馃帀
Large (returning 26k results) 270K 142s 17.5s 8X faster 馃帀

(the last measure has been done to check the less favourable case for ripgrep, which is a search that returns a lot of results (26k) which need to be passed from the ripgrep process to Atom. This is still 8X times faster than the current search logic. We could limit the number of results, but this can be done as a separate PR).

Possible Drawbacks

The search behaviour with ripgrep is slightly different than the one currently implemented, and there may be still some edge cases to polish. There's more information about the changes in atom/atom#19348.

To mitigate that, I've done quite extensive tests but I'll keep checking for potential edge cases. Also, I'm planning to leave this config flag at least for 1 version of Atom so we can catch as many issues as possible before shipping this to users by default .

Applicable Issues

#1075

@rafeca rafeca changed the title Add config Add config option to use ripgrep for scanning files May 24, 2019
@rafeca
Copy link
Contributor Author

@rafeca rafeca commented May 28, 2019

Going to merge this one, if there's any feedback afterwards I'll handle it on a separate PR

@rafeca rafeca merged commit bdfdddd into master May 28, 2019
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@rafeca rafeca deleted the ripgrep-option branch May 28, 2019
@rafeca
Copy link
Contributor Author

@rafeca rafeca commented May 28, 2019

As @maxbrunsfeld has mentioned on Slack, the ripgrep scanner also fixes #806, which is a long-standing issue regarding not ignoring VCS-ignored files when there's a filter path in the project search.

@dwelle
Copy link
Contributor

@dwelle dwelle commented Jun 17, 2019

Scanning the commit, it'd be great to allow to enable ripgrep's PCRE2 so that lookbehind assertions work, too (#571). AFAIK it has a small perf regression, so if that's a concern, best adding it as opt-in only (ala vscode).

@rafeca
Copy link
Contributor Author

@rafeca rafeca commented Jun 17, 2019

@dwelle thanks for the suggestion!

Would you be willing to contribute adding this as a config option?

@dwelle
Copy link
Contributor

@dwelle dwelle commented Jun 17, 2019

Sure, provided it's not time-sensitive because I reckon it'll take a while --- I'll have to set up a dev enviro and figure out how to run it (plus I have little time ATM, but who does). Alternatively, I can write the code + tests blindly, cross my fingers and let the CI and reviewers do the rest :).

Btw, should a new event metadata field be added, too?

@rafeca
Copy link
Contributor Author

@rafeca rafeca commented Jun 17, 2019

Thanks @dwelle! I don't think time sensitive, so it's fine 馃槃

Setting up the dev environment is quite easy, you can follow the instructions here and here, but if you prefer you can let the CI do the job 馃槶

Just take into account that the search logic is in Atom core. You can add a new option to the [workspace.scan() method] (https://github.com/atom/atom/blob/master/src/workspace.js#L2048) to use PCRE2, and then pass it from find-and-replace based on the config option that you create (as I did for ripgrep).

You don't need to care about the DefaultDirectorySearcher implementation (just mention on the configuration option description that it only works on ripgrep).

Also, make sure to create a new test on the workspace-spec.js file that asserts that the PCRE2 option works well (you can use any existing test as inspiration).

Regarding the metadata field, I don't think it's needed for now. Only if we see that the performance of the searches regress after this change we'll add that.

Thanks again for your help!

@dwelle
Copy link
Contributor

@dwelle dwelle commented Jun 30, 2019

@rafeca finally managed to free up some time to work on this.

Being on Win7 I've been trying to bootstrap the atom repo for quite some time but it keeps failing on the well known msbuild.exe failed with exit code: 1 error.

I believe I have the correct version of Visual C++ Build Tools, but as for the Windows 8 SDK, not sure. The installation shows as successful, but then doesn't show as installed when trying to install it again (to verify).

I'll try to re-install everything, but if I don't break through soon, I'll have to go with my original idea and write the code blind.

@witygass
Copy link

@witygass witygass commented Jul 23, 2019

Yaaaaas! Thank you so much. Alleviates one of the prime badgering points that my co-workers love to make fun of me for when I say that I still use atom after years of being a dev.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants