Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add --stats flag and implement basic stat tracking (closes #411) #799

Closed
wants to merge 1 commit into from

Conversation

@balajisivaraman
Copy link
Contributor

balajisivaraman commented Feb 13, 2018

Creating this PR for review purposes and as a WIP thread for adding the --stats flag to ripgrep.

  • Implement tracking for basic stats (no. of matches, files with matches, files searched, and time taken)
  • Ensure --stats is ignored when --files, --files-with-matches or --files-without-match are passed
  • Implement tracking of bytes searched for --no-mmap searches using the counting_reader trick. This will require changing the worker's run method to return an Optional bytes_searched attribute along with the existing match_count. We also need to ensure this happens only when --stats is passed.

Other thoughts:

  • Have used the default Duration implementation from std as did not want to pull in any other dependencies such as time or chrono, and this served the purpose very well.

  • Have defaulted the time to always display in seconds since this is what ag does as well. rg also prints only 3 numbers after the decimal point, as opposed to ag's 6.

    The only reasoning behind this is that Rust only allows us to control the digits printed after the decimal point, not overall. If we run a very long search, we don't want to print something like 2751.648931 seconds. So have chosen the nicer option of 3 digits after the decimal point.

    I think ag prints 6 digits overall, with the number of digits printed after the decimal point depending on the overall width of the string. (This is the behaviour I've observed; have not verified this with the code.)

@balajisivaraman

This comment has been minimized.

Copy link
Contributor Author

balajisivaraman commented Feb 20, 2018

I'll close this PR as the current implementation is simply wrong. When I looked at the code for this originally, I overlooked the fact that the searchers return matching line count and not the individual match count, which is what we want for the --stats flag. My apologies for that.

I'll work on the proper implementation, which I expect to be a bit more involving than this one. In the meantime, I don't want this clogging up the PR space here.

@balajisivaraman balajisivaraman deleted the balajisivaraman:fix_411 branch Feb 20, 2018
Copy link
Owner

BurntSushi left a comment

@balajisivaraman Thanks for working on this! I know you closed this, but you might consider that it is possible to get this merged without much additional work. For example, if you change "match count" to "matched line count" (or similar), then that would be correct and still useful IMO. :) I defer to you! Regardless, I left some small amount of feedback.

the time taken for the entire search to complete.
Note that this flag has no effect if --files, --files-with-matches or
--files-without-match is passed.");

This comment has been minimized.

Copy link
@BurntSushi

BurntSushi Feb 20, 2018

Owner

Can you put the "); on its own line please?


let arg = RGArg::switch("stats")
.help(SHORT).long_help(LONG);

This comment has been minimized.

Copy link
@BurntSushi

BurntSushi Feb 20, 2018

Owner

Remove this blank line please.

@@ -373,6 +395,15 @@ fn eprint_nothing_searched() {
Try running again with --debug.");
}

fn print_stats(match_count: u64, paths_searched: u64, paths_matched: u64, time_elapsed: Duration) {

This comment has been minimized.

Copy link
@BurntSushi

BurntSushi Feb 20, 2018

Owner

All lines of source code in ripgrep should be wrapped to 79 columns inclusive (excepting URLs or other things that are inconvenient to wrap). Please see examples of other function declarations for formatting tips.

@balajisivaraman balajisivaraman restored the balajisivaraman:fix_411 branch Feb 20, 2018
@balajisivaraman

This comment has been minimized.

Copy link
Contributor Author

balajisivaraman commented Feb 20, 2018

@BurntSushi, I wouldn't mind getting this PR merged at all. :-)

The only reason why I closed this was I'm going to begin work on the full implementation of it with bytes searched and individual count, so I thought we wouldn't want this anyway.

But seeing as that is more involving change with multiple moving parts, it might take some time. So it makes sense to go ahead with this partial implementation in the meanwhile.

I'll update the PR to reflect your comments. Thanks! :-)

@BurntSushi

This comment has been minimized.

Copy link
Owner

BurntSushi commented Mar 10, 2018

@balajisivaraman This was great work, thanks so much! I merged this in 00520b3 with no real changes other than fixing up style (lines should be limited to 79 columns inclusive) and resolving conflicts. I also tweaked the wording of output slightly to indicate that total matches actually corresponds to the total number of matched lines. (So we leave the door open to emit another statistic that reports the total matches.)

@balajisivaraman balajisivaraman deleted the balajisivaraman:fix_411 branch Mar 10, 2018
@balajisivaraman

This comment has been minimized.

Copy link
Contributor Author

balajisivaraman commented Mar 10, 2018

@BurntSushi, Thank you for merging this. And my apologies for not updating this PR. I was waiting for the other PR (match_line_count) to get merged so I could update it properly and completely forgot about it. Thanks again. 😄

@BurntSushi

This comment has been minimized.

Copy link
Owner

BurntSushi commented Mar 10, 2018

@balajisivaraman No worries! Sometimes my work style isn't amenable to waiting too long, and is susceptible to short bursts of maintenance activity. :-) Context switches are extremely expensive for me, so I figured that I would just amortize them by processing all of your PRs, which all were somewhat related!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.