Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use 'string-match-p' instead of 'string-match' where applicable #1517

Merged
merged 5 commits into from Aug 10, 2018

Conversation

@sergv
Copy link
Contributor

commented Apr 1, 2018

This should cause Emacs to do a little bit less work when matching regexps (judging from C code).

PS Although I tried to not use string-match-p if match data is subsequently accessed, I may have missed something. Please do re-verify that everything's ok before merging.

ivy.el Outdated
@@ -910,14 +910,14 @@ If the text hasn't changed as a result, forward to `ivy-alt-done'."
(postfix (car (last parts)))
(case-fold-search (ivy--case-fold-p ivy-text))
(completion-ignore-case case-fold-search)
(startp (string-match "^\\^" postfix))
(startp (string-matc-ph "^\\^" postfix))

This comment has been minimized.

Copy link
@basil-conto

basil-conto Apr 1, 2018

Collaborator

s/matc-ph/match-p/

This comment has been minimized.

Copy link
@sergv

sergv Apr 1, 2018

Author Contributor

Fixed, thanks!

@sergv sergv force-pushed the sergv:master branch Apr 1, 2018

ivy.el Outdated
@@ -1963,14 +1963,14 @@ This is useful for recursive `ivy-read'."
(let ((w (length (number-to-string
(length ivy--all-candidates))))
(s (copy-sequence ivy-count-format)))
(string-match "%d" s)
(string-match-p "%d" s)
(match-end 0)

This comment has been minimized.

Copy link
@abo-abo

abo-abo Apr 3, 2018

Owner

This is a no-op right? And it was in the code before your change?

This comment has been minimized.

Copy link
@sergv

sergv Apr 3, 2018

Author Contributor

Oops, that's the issue I was talking about. This is not a no-op and my change is incorrect: before the change string-match modified global match data which was subsequently read by match-end. The string-match-p leaves match data intact, so match-end now reads whatever was previously in that global structure. I'll revert this - very unprofessional of me, sorry.

This comment has been minimized.

Copy link
@basil-conto

basil-conto Apr 3, 2018

Collaborator

@sergv One of the calls to match-end is a no-op though, as its value isn't used anywhere.

This comment has been minimized.

Copy link
@sergv

sergv Apr 3, 2018

Author Contributor

Oh, indeed.

ivy.el Outdated
(match-end 0)
(string-match "%d" s (match-end 0))
(string-match-p "%d" s (match-end 0))
(setq s (replace-match (format "%%-%dd" w) nil nil s))

This comment has been minimized.

Copy link
@abo-abo

abo-abo Apr 3, 2018

Owner

Doesn't replace-match use match-data? I think this will introduce a bug. Can you confirm?

This comment has been minimized.

Copy link
@basil-conto

basil-conto Apr 3, 2018

Collaborator

Not only replace-match uses match-data, but also the match-end passed to the second string-match from the first string-match. So neither of the two string-matches should be changed, at least at first glance.

This comment has been minimized.

Copy link
@sergv

sergv Apr 3, 2018

Author Contributor

Yes, my changes are wrong here.

@abo-abo

This comment has been minimized.

Copy link
Owner

commented Apr 3, 2018

Thanks for the contribution. Two questions:

  • Do you have an Emacs Copyright Assignment?
  • Can you give a guesstimate of the speed-up? If it's not too big, maybe we should go for the flexibility during debug.
@sergv

This comment has been minimized.

Copy link
Contributor Author

commented Apr 3, 2018

Thanks for quick response! Per your questions:

  • No, I don't have an Emacs Copyright Assignment
  • The speed up is a bit tricky. So far I considered it to be just "faster" (the computer is going to do less work, how that can be slow?). But after your request I tried to come up with a benchmark that will show the difference. I was executing following code a couple of times, but unfortunately I didn't get any consistent results (a few runs are provided below). Sometimes string-match-p was marginally faster, sometimes string-match was faster and it was changing between the runs. I've even tried to manually unpack string-match-p and lift the way it disables group capture outside the benchmaark loop body, but that didn't really matter to the benchmarks.
(let ((re "a+\\(b*\\)c+")
      (input (concat (make-string 100 ?a)
                     (make-string 100 ?b)
                     (make-string 100 ?c)))
      (repeats 100000))
  (eval
   `(progn      
      (defun test1 ()
        (garbage-collect)
        (benchmark-run ,repeats
          (string-match-p ,re ,input)))
      (defun test2 ()
        (garbage-collect)
        (benchmark-run ,repeats
          (string-match ,re ,input)))
      ;; Manually unpacked `string-match-p' that sets `inhibit-changing-match-data'
      ;; outside the benchmark loop.
      (defun test3 ()
        (garbage-collect)
        (let ((inhibit-changing-match-data t))
          (benchmark-run ,repeats
            (string-match ,re ,input))))))
  (byte-compile #'test1)
  (byte-compile #'test2)
  (byte-compile #'test3)
  (list (list 'string-match-p (car (test1)))
        (list 'string-match (car (test2)))
        (list 'hacky (car (test3)))))

Some sample runs:

((string-match-p 0.646643222) (string-match 0.655887884) (hacky 0.653618295))
((string-match-p 0.655958689) (string-match 0.694801363) (hacky 0.6770779859999999))
((string-match-p 0.701573058) (string-match 0.688380931) (hacky 0.6642466770000001))
((string-match-p 0.67882954) (string-match 0.696929307) (hacky 0.69309022))
((string-match-p 0.6775590699999999) (string-match 0.729273346) (hacky 0.692307847))
((string-match-p 0.566093728) (string-match 0.563458819) (hacky 0.548185506))
((string-match-p 0.615039049) (string-match 0.620842084) (hacky 0.64707289))
((string-match-p 0.641689823) (string-match 0.634179539) (hacky 0.646996679))

Thus, given that

  1. The change is highly error-prone;
  2. There appears to be no significant speedup to be gained;
  3. We'll have to wait until Copyright Assignment process commences

I propose to drop this change as pretty minor and close this PR. Sorry for inconvenience.

@rgrinberg

This comment has been minimized.

Copy link

commented Apr 4, 2018

One advantage of using the -p variants is that you protect yourself against errors when you forget save-match-data. Swiper is unlikely to have such bugs, but it's still nice to be more secure.

@basil-conto

This comment has been minimized.

Copy link
Collaborator

commented Apr 4, 2018

@rgrinberg That's very true. In addition, it's also important IMO to convey the correct intent to the reader, something I think comes up more frequently and has a greater chance of introducing subtle future bugs.

@abo-abo

This comment has been minimized.

Copy link
Owner

commented Apr 4, 2018

No, I don't have an Emacs Copyright Assignment

Are you willing to get it? I can only merge changes up to 15 lines without a CA.

I propose to drop this change as pretty minor and close this PR. Sorry for inconvenience.

it's also important IMO to convey the correct intent to the reader, something I think comes up more frequently and has a greater chance of introducing subtle future bugs.

@abo-abo

This comment has been minimized.

Copy link
Owner

commented Apr 4, 2018

No, I don't have an Emacs Copyright Assignment

Are you willing to get it? I can only merge changes up to 15 lines without a CA.

I propose to drop this change as pretty minor and close this PR.

Dropping it is an option, sure. But I do believe some small speed-up can be gained, and further more string-match-p conveys the intent better.

Sorry for inconvenience.

No trouble at all, this can be a good change provided it doesn't introduce bugs. Maybe we can do it in smaller chunks to make sure no string-match that actually needs to be there gets replaced.

it's also important IMO to convey the correct intent to the reader, something I think comes up more frequently and has a greater chance of introducing subtle future bugs.

Agreed.

@sergv sergv force-pushed the sergv:master branch Apr 4, 2018

@sergv

This comment has been minimized.

Copy link
Contributor Author

commented Apr 4, 2018

Okay, if there's consensus that this is a nice thing to have then lets keep the PR.I have rebased the branch and addressed the issues identified previously. I'll look into obtaining Emacs Copyright Assignment and get back to you.

@abo-abo

This comment has been minimized.

Copy link
Owner

commented Apr 5, 2018

@sergv Thanks.

@sergv

This comment has been minimized.

Copy link
Contributor Author

commented Apr 25, 2018

@abo-abo I have received confirmation that my copyright assignment to the FSF is now complete. I believe we can merge now?

ivy.el Outdated
@@ -1965,14 +1965,12 @@ This is useful for recursive `ivy-read'."
(let ((w (length (number-to-string
(length ivy--all-candidates))))
(s (copy-sequence ivy-count-format)))
(string-match "%d" s)
(match-end 0)

This comment has been minimized.

Copy link
@basil-conto

basil-conto Apr 25, 2018

Collaborator

The call to match-end is indeed a no-op, but I'm not sure you can get rid of the call to string-match; there's a call to match-end just after these two forms.

This comment has been minimized.

Copy link
@sergv

sergv Apr 25, 2018

Author Contributor

string-match here is followed by another string-match, which would overwrite match data.

This comment has been minimized.

Copy link
@basil-conto

basil-conto Apr 25, 2018

Collaborator

@sergv

string-match here is followed by another string-match, which would overwrite match data.

Yes, but the second string-match looks like it uses the match-end of the first string-match as its starting index.

This comment has been minimized.

Copy link
@sergv

sergv Apr 26, 2018

Author Contributor

I have found a way to replace two string-matches with a single one. Please take a look at Simplify ivy-add-prompt-count commit.

This comment has been minimized.

Copy link
@basil-conto

basil-conto Apr 26, 2018

Collaborator

@sergv

I have found a way to replace two string-matches with a single one.

Nice, but why stop there? :)

diff --git a/ivy.el b/ivy.el
index 193af0a..d154bbf 100644
--- a/ivy.el
+++ b/ivy.el
@@ -1987,17 +1987,11 @@ ivy-add-prompt-count
         ((null ivy-count-format)
          (error
           "`ivy-count-format' can't be nil.  Set it to \"\" instead"))
-        ((string-match "%d.*%d" ivy-count-format)
-         (let ((w (length (number-to-string
-                           (length ivy--all-candidates))))
-               (s (copy-sequence ivy-count-format)))
+        ((string-match "%d.*\\(%d\\)" ivy-count-format)
+         (let* ((w (1+ (floor (log (max 1 (length ivy--all-candidates)) 10))))
+                (s (replace-match (format "%%-%dd" w) t t ivy-count-format 1)))
            (string-match "%d" s)
-           (match-end 0)
-           (string-match "%d" s (match-end 0))
-           (setq s (replace-match (format "%%-%dd" w) nil nil s))
-           (string-match "%d" s)
-           (concat (replace-match (format "%%%dd" w) nil nil s)
-                   prompt)))
+           (concat (replace-match (format "%%%dd" w) t t s) prompt)))
         ((string-match "%.*d" ivy-count-format)
          (concat ivy-count-format prompt))
         (ivy--directory

This way, there is less searching, less code, way less string allocation, and you also get to fix the replace-match arguments FIXEDCASE and LITERAL which shouldn't have been nil to begin with.

ivy.el Outdated
@@ -2955,13 +2953,13 @@ All CANDIDATES are assumed to match NAME."
"Re-sort candidates by NAME.
All CANDIDATES are assumed to match NAME.
Prefix matches to NAME are put ahead of the list."
(if (or (string-match "^\\^" name) (string= name ""))
(if (or (string-match-p "^\\^" name) (string= name ""))

This comment has been minimized.

Copy link
@basil-conto

basil-conto Apr 25, 2018

Collaborator

Should the regexp here and in a couple of places further down be:

"\\`\\^"

?

This comment has been minimized.

Copy link
@sergv

sergv Apr 25, 2018

Author Contributor

Well, if there's only a single line in the name string then there would be no difference. I don't have a strong opinion in this case so I left it as is since it appears to be working OK.

This comment has been minimized.

Copy link
@basil-conto

basil-conto Apr 25, 2018

Collaborator

Well, if there's only a single line in the name string then there would be no difference.

Yes, but if this code shouldn't be matching line beginnings then it shouldn't be matching line beginnings. My question is whether this code should be matching line beginnings. :)

ivy.el Outdated
@@ -3831,7 +3829,7 @@ Don't finish completion."
(interactive)
(delete-minibuffer-contents)
(if (and ivy--directory
(string-match "/$" (ivy-state-current ivy-last)))
(string-match-p "/$" (ivy-state-current ivy-last)))

This comment has been minimized.

Copy link
@basil-conto

basil-conto Apr 25, 2018

Collaborator

Should the regexp here be:

"/\\'"

?

This comment has been minimized.

Copy link
@sergv

sergv Apr 25, 2018

Author Contributor

As with replacement of ^, there is no difference if we're matching against a single line. I don't have a strong opinion on the matter. Do you feel it should be addressed?

This comment has been minimized.

Copy link
@basil-conto

basil-conto Apr 25, 2018

Collaborator

Do you feel it should be addressed?

Unless I'm missing something, then yes, because newlines are valid filename constituents. This can obviously be done as a separate commit or PR; I'm just drawing attention to it while we're here.

@basil-conto

This comment has been minimized.

Copy link
Collaborator

commented Jul 19, 2018

What's the status on this?

@sergv sergv force-pushed the sergv:master branch Jul 19, 2018

@sergv

This comment has been minimized.

Copy link
Contributor Author

commented Jul 19, 2018

@basil-conto I've replaced "^" by "\`" and "$" by "\'", rebased the branch and I also got the Copyright Assignment. I think the PR can be merged now.

ivy.el Outdated
@@ -3918,7 +3913,7 @@ Don't finish completion."
(interactive)
(delete-minibuffer-contents)
(if (and ivy--directory
(string-match "/$" (ivy-state-current ivy-last)))
(string-match-p "/\\'" (ivy-state-current ivy-last)))

This comment has been minimized.

Copy link
@basil-conto

basil-conto Jul 20, 2018

Collaborator

I forgot, there's ivy--dirname-p for this.

ivy.el Outdated
(length ivy--all-candidates))))
(s (copy-sequence ivy-count-format)))
((string-match "%d.*\\(%d\\)" ivy-count-format)
(let ((w (1+ (floor (log (max 1 (length ivy--all-candidates)) 10))))

This comment has been minimized.

Copy link
@basil-conto

basil-conto Jul 20, 2018

Collaborator

This should be let*.

ivy.el Outdated
(s (copy-sequence ivy-count-format)))
((string-match "%d.*\\(%d\\)" ivy-count-format)
(let ((w (1+ (floor (log (max 1 (length ivy--all-candidates)) 10))))
(s (replace-match (format "%%-%dd" w) t t s 1)))

This comment has been minimized.

Copy link
@basil-conto

basil-conto Jul 20, 2018

Collaborator

The fourth argument to replace-match should be ivy-count-format.

@basil-conto

This comment has been minimized.

Copy link
Collaborator

commented Jul 20, 2018

Apart from my last few comments, LGTM.

@sergv sergv force-pushed the sergv:master branch Jul 20, 2018

@sergv

This comment has been minimized.

Copy link
Contributor Author

commented Jul 20, 2018

@basil-conto Thanks, I've addressed all of your comments.

@basil-conto

This comment has been minimized.

Copy link
Collaborator

commented Aug 9, 2018

@sergv Can you please resolve the merge conflicts?

@abo-abo Anything holding this up, other than the merge conflict?

@abo-abo

This comment has been minimized.

Copy link
Owner

commented Aug 9, 2018

@basil-conto I think it's fine to merge once the conflicts are resolved.

sergv added 2 commits Apr 4, 2018
sergv added 3 commits Apr 26, 2018

@sergv sergv force-pushed the sergv:master branch to 8efc8ab Aug 9, 2018

@sergv

This comment has been minimized.

Copy link
Contributor Author

commented Aug 9, 2018

@abo-abo @basil-conto I've rebased the branch. Please merge.

@abo-abo abo-abo merged commit 8efc8ab into abo-abo:master Aug 10, 2018

1 of 2 checks passed

continuous-integration/travis-ci/push The Travis CI build is in progress
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@abo-abo

This comment has been minimized.

Copy link
Owner

commented Aug 10, 2018

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.