Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use 'string-match-p' instead of 'string-match' where applicable #1517

Merged
merged 5 commits into from Aug 10, 2018
Merged

Use 'string-match-p' instead of 'string-match' where applicable #1517

merged 5 commits into from Aug 10, 2018

Conversation

sergv
Copy link
Contributor

@sergv sergv commented Apr 1, 2018

This should cause Emacs to do a little bit less work when matching regexps (judging from C code).

PS Although I tried to not use string-match-p if match data is subsequently accessed, I may have missed something. Please do re-verify that everything's ok before merging.

ivy.el Outdated
@@ -910,14 +910,14 @@ If the text hasn't changed as a result, forward to `ivy-alt-done'."
(postfix (car (last parts)))
(case-fold-search (ivy--case-fold-p ivy-text))
(completion-ignore-case case-fold-search)
(startp (string-match "^\\^" postfix))
(startp (string-matc-ph "^\\^" postfix))
Copy link
Collaborator

@basil-conto basil-conto Apr 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/matc-ph/match-p/

Copy link
Contributor Author

@sergv sergv Apr 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks!

ivy.el Outdated
@@ -1963,14 +1963,14 @@ This is useful for recursive `ivy-read'."
(let ((w (length (number-to-string
(length ivy--all-candidates))))
(s (copy-sequence ivy-count-format)))
(string-match "%d" s)
(string-match-p "%d" s)
(match-end 0)
Copy link
Owner

@abo-abo abo-abo Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a no-op right? And it was in the code before your change?

Copy link
Contributor Author

@sergv sergv Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, that's the issue I was talking about. This is not a no-op and my change is incorrect: before the change string-match modified global match data which was subsequently read by match-end. The string-match-p leaves match data intact, so match-end now reads whatever was previously in that global structure. I'll revert this - very unprofessional of me, sorry.

Copy link
Collaborator

@basil-conto basil-conto Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sergv One of the calls to match-end is a no-op though, as its value isn't used anywhere.

Copy link
Contributor Author

@sergv sergv Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, indeed.

ivy.el Outdated
(match-end 0)
(string-match "%d" s (match-end 0))
(string-match-p "%d" s (match-end 0))
(setq s (replace-match (format "%%-%dd" w) nil nil s))
Copy link
Owner

@abo-abo abo-abo Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't replace-match use match-data? I think this will introduce a bug. Can you confirm?

Copy link
Collaborator

@basil-conto basil-conto Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only replace-match uses match-data, but also the match-end passed to the second string-match from the first string-match. So neither of the two string-matches should be changed, at least at first glance.

Copy link
Contributor Author

@sergv sergv Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my changes are wrong here.

@abo-abo
Copy link
Owner

@abo-abo abo-abo commented Apr 3, 2018

Thanks for the contribution. Two questions:

  • Do you have an Emacs Copyright Assignment?
  • Can you give a guesstimate of the speed-up? If it's not too big, maybe we should go for the flexibility during debug.

@sergv
Copy link
Contributor Author

@sergv sergv commented Apr 3, 2018

Thanks for quick response! Per your questions:

  • No, I don't have an Emacs Copyright Assignment
  • The speed up is a bit tricky. So far I considered it to be just "faster" (the computer is going to do less work, how that can be slow?). But after your request I tried to come up with a benchmark that will show the difference. I was executing following code a couple of times, but unfortunately I didn't get any consistent results (a few runs are provided below). Sometimes string-match-p was marginally faster, sometimes string-match was faster and it was changing between the runs. I've even tried to manually unpack string-match-p and lift the way it disables group capture outside the benchmaark loop body, but that didn't really matter to the benchmarks.
(let ((re "a+\\(b*\\)c+")
      (input (concat (make-string 100 ?a)
                     (make-string 100 ?b)
                     (make-string 100 ?c)))
      (repeats 100000))
  (eval
   `(progn      
      (defun test1 ()
        (garbage-collect)
        (benchmark-run ,repeats
          (string-match-p ,re ,input)))
      (defun test2 ()
        (garbage-collect)
        (benchmark-run ,repeats
          (string-match ,re ,input)))
      ;; Manually unpacked `string-match-p' that sets `inhibit-changing-match-data'
      ;; outside the benchmark loop.
      (defun test3 ()
        (garbage-collect)
        (let ((inhibit-changing-match-data t))
          (benchmark-run ,repeats
            (string-match ,re ,input))))))
  (byte-compile #'test1)
  (byte-compile #'test2)
  (byte-compile #'test3)
  (list (list 'string-match-p (car (test1)))
        (list 'string-match (car (test2)))
        (list 'hacky (car (test3)))))

Some sample runs:

((string-match-p 0.646643222) (string-match 0.655887884) (hacky 0.653618295))
((string-match-p 0.655958689) (string-match 0.694801363) (hacky 0.6770779859999999))
((string-match-p 0.701573058) (string-match 0.688380931) (hacky 0.6642466770000001))
((string-match-p 0.67882954) (string-match 0.696929307) (hacky 0.69309022))
((string-match-p 0.6775590699999999) (string-match 0.729273346) (hacky 0.692307847))
((string-match-p 0.566093728) (string-match 0.563458819) (hacky 0.548185506))
((string-match-p 0.615039049) (string-match 0.620842084) (hacky 0.64707289))
((string-match-p 0.641689823) (string-match 0.634179539) (hacky 0.646996679))

Thus, given that

  1. The change is highly error-prone;
  2. There appears to be no significant speedup to be gained;
  3. We'll have to wait until Copyright Assignment process commences

I propose to drop this change as pretty minor and close this PR. Sorry for inconvenience.

@rgrinberg
Copy link
Contributor

@rgrinberg rgrinberg commented Apr 4, 2018

One advantage of using the -p variants is that you protect yourself against errors when you forget save-match-data. Swiper is unlikely to have such bugs, but it's still nice to be more secure.

@basil-conto
Copy link
Collaborator

@basil-conto basil-conto commented Apr 4, 2018

@rgrinberg That's very true. In addition, it's also important IMO to convey the correct intent to the reader, something I think comes up more frequently and has a greater chance of introducing subtle future bugs.

@abo-abo
Copy link
Owner

@abo-abo abo-abo commented Apr 4, 2018

No, I don't have an Emacs Copyright Assignment

Are you willing to get it? I can only merge changes up to 15 lines without a CA.

I propose to drop this change as pretty minor and close this PR. Sorry for inconvenience.

it's also important IMO to convey the correct intent to the reader, something I think comes up more frequently and has a greater chance of introducing subtle future bugs.

@abo-abo
Copy link
Owner

@abo-abo abo-abo commented Apr 4, 2018

No, I don't have an Emacs Copyright Assignment

Are you willing to get it? I can only merge changes up to 15 lines without a CA.

I propose to drop this change as pretty minor and close this PR.

Dropping it is an option, sure. But I do believe some small speed-up can be gained, and further more string-match-p conveys the intent better.

Sorry for inconvenience.

No trouble at all, this can be a good change provided it doesn't introduce bugs. Maybe we can do it in smaller chunks to make sure no string-match that actually needs to be there gets replaced.

it's also important IMO to convey the correct intent to the reader, something I think comes up more frequently and has a greater chance of introducing subtle future bugs.

Agreed.

@sergv
Copy link
Contributor Author

@sergv sergv commented Apr 4, 2018

Okay, if there's consensus that this is a nice thing to have then lets keep the PR.I have rebased the branch and addressed the issues identified previously. I'll look into obtaining Emacs Copyright Assignment and get back to you.

@abo-abo
Copy link
Owner

@abo-abo abo-abo commented Apr 5, 2018

@sergv Thanks.

@sergv
Copy link
Contributor Author

@sergv sergv commented Apr 25, 2018

@abo-abo I have received confirmation that my copyright assignment to the FSF is now complete. I believe we can merge now?

ivy.el Outdated
@@ -1965,14 +1965,12 @@ This is useful for recursive `ivy-read'."
(let ((w (length (number-to-string
(length ivy--all-candidates))))
(s (copy-sequence ivy-count-format)))
(string-match "%d" s)
(match-end 0)
Copy link
Collaborator

@basil-conto basil-conto Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to match-end is indeed a no-op, but I'm not sure you can get rid of the call to string-match; there's a call to match-end just after these two forms.

Copy link
Contributor Author

@sergv sergv Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string-match here is followed by another string-match, which would overwrite match data.

Copy link
Collaborator

@basil-conto basil-conto Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sergv

string-match here is followed by another string-match, which would overwrite match data.

Yes, but the second string-match looks like it uses the match-end of the first string-match as its starting index.

Copy link
Contributor Author

@sergv sergv Apr 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have found a way to replace two string-matches with a single one. Please take a look at Simplify ivy-add-prompt-count commit.

Copy link
Collaborator

@basil-conto basil-conto Apr 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sergv

I have found a way to replace two string-matches with a single one.

Nice, but why stop there? :)

diff --git a/ivy.el b/ivy.el
index 193af0a..d154bbf 100644
--- a/ivy.el
+++ b/ivy.el
@@ -1987,17 +1987,11 @@ ivy-add-prompt-count
         ((null ivy-count-format)
          (error
           "`ivy-count-format' can't be nil.  Set it to \"\" instead"))
-        ((string-match "%d.*%d" ivy-count-format)
-         (let ((w (length (number-to-string
-                           (length ivy--all-candidates))))
-               (s (copy-sequence ivy-count-format)))
+        ((string-match "%d.*\\(%d\\)" ivy-count-format)
+         (let* ((w (1+ (floor (log (max 1 (length ivy--all-candidates)) 10))))
+                (s (replace-match (format "%%-%dd" w) t t ivy-count-format 1)))
            (string-match "%d" s)
-           (match-end 0)
-           (string-match "%d" s (match-end 0))
-           (setq s (replace-match (format "%%-%dd" w) nil nil s))
-           (string-match "%d" s)
-           (concat (replace-match (format "%%%dd" w) nil nil s)
-                   prompt)))
+           (concat (replace-match (format "%%%dd" w) t t s) prompt)))
         ((string-match "%.*d" ivy-count-format)
          (concat ivy-count-format prompt))
         (ivy--directory

This way, there is less searching, less code, way less string allocation, and you also get to fix the replace-match arguments FIXEDCASE and LITERAL which shouldn't have been nil to begin with.

ivy.el Outdated
@@ -2955,13 +2953,13 @@ All CANDIDATES are assumed to match NAME."
"Re-sort candidates by NAME.
All CANDIDATES are assumed to match NAME.
Prefix matches to NAME are put ahead of the list."
(if (or (string-match "^\\^" name) (string= name ""))
(if (or (string-match-p "^\\^" name) (string= name ""))
Copy link
Collaborator

@basil-conto basil-conto Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the regexp here and in a couple of places further down be:

"\\`\\^"

?

Copy link
Contributor Author

@sergv sergv Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if there's only a single line in the name string then there would be no difference. I don't have a strong opinion in this case so I left it as is since it appears to be working OK.

Copy link
Collaborator

@basil-conto basil-conto Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if there's only a single line in the name string then there would be no difference.

Yes, but if this code shouldn't be matching line beginnings then it shouldn't be matching line beginnings. My question is whether this code should be matching line beginnings. :)

ivy.el Outdated
@@ -3831,7 +3829,7 @@ Don't finish completion."
(interactive)
(delete-minibuffer-contents)
(if (and ivy--directory
(string-match "/$" (ivy-state-current ivy-last)))
(string-match-p "/$" (ivy-state-current ivy-last)))
Copy link
Collaborator

@basil-conto basil-conto Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the regexp here be:

"/\\'"

?

Copy link
Contributor Author

@sergv sergv Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with replacement of ^, there is no difference if we're matching against a single line. I don't have a strong opinion on the matter. Do you feel it should be addressed?

Copy link
Collaborator

@basil-conto basil-conto Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you feel it should be addressed?

Unless I'm missing something, then yes, because newlines are valid filename constituents. This can obviously be done as a separate commit or PR; I'm just drawing attention to it while we're here.

@basil-conto
Copy link
Collaborator

@basil-conto basil-conto commented Jul 19, 2018

What's the status on this?

@sergv
Copy link
Contributor Author

@sergv sergv commented Jul 19, 2018

@basil-conto I've replaced "^" by "\`" and "$" by "\'", rebased the branch and I also got the Copyright Assignment. I think the PR can be merged now.

ivy.el Outdated
@@ -3918,7 +3913,7 @@ Don't finish completion."
(interactive)
(delete-minibuffer-contents)
(if (and ivy--directory
(string-match "/$" (ivy-state-current ivy-last)))
(string-match-p "/\\'" (ivy-state-current ivy-last)))
Copy link
Collaborator

@basil-conto basil-conto Jul 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot, there's ivy--dirname-p for this.

ivy.el Outdated
(length ivy--all-candidates))))
(s (copy-sequence ivy-count-format)))
((string-match "%d.*\\(%d\\)" ivy-count-format)
(let ((w (1+ (floor (log (max 1 (length ivy--all-candidates)) 10))))
Copy link
Collaborator

@basil-conto basil-conto Jul 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be let*.

ivy.el Outdated
(s (copy-sequence ivy-count-format)))
((string-match "%d.*\\(%d\\)" ivy-count-format)
(let ((w (1+ (floor (log (max 1 (length ivy--all-candidates)) 10))))
(s (replace-match (format "%%-%dd" w) t t s 1)))
Copy link
Collaborator

@basil-conto basil-conto Jul 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fourth argument to replace-match should be ivy-count-format.

@basil-conto
Copy link
Collaborator

@basil-conto basil-conto commented Jul 20, 2018

Apart from my last few comments, LGTM.

@sergv
Copy link
Contributor Author

@sergv sergv commented Jul 20, 2018

@basil-conto Thanks, I've addressed all of your comments.

@basil-conto
Copy link
Collaborator

@basil-conto basil-conto commented Aug 9, 2018

@sergv Can you please resolve the merge conflicts?

@abo-abo Anything holding this up, other than the merge conflict?

@abo-abo
Copy link
Owner

@abo-abo abo-abo commented Aug 9, 2018

@basil-conto I think it's fine to merge once the conflicts are resolved.

@sergv
Copy link
Contributor Author

@sergv sergv commented Aug 9, 2018

@abo-abo @basil-conto I've rebased the branch. Please merge.

@abo-abo abo-abo merged commit 8efc8ab into abo-abo:master Aug 10, 2018
1 of 2 checks passed
@abo-abo
Copy link
Owner

@abo-abo abo-abo commented Aug 10, 2018

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants