Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary length checks + improved indexing of charCodeCache #10

Merged
merged 1 commit into from
Jun 14, 2017

Conversation

gustf
Copy link
Contributor

@gustf gustf commented May 15, 2017

removed unnecessary length checks + improved indexing of charCodeCache

@sindresorhus
Copy link
Owner

Can you explain your changes? Those things are there for a reason.

@@ -65,7 +53,7 @@ module.exports = function (a, b) {
var j = 0;

while (i < aLen) {
charCodeCache[start + i] = a.charCodeAt(start + i);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessary to add start here to the charCodeCache, as long as we do the same in the loop.

This make the bench ~4-5% faster on my cpu

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense indeed. Can't remember why I did that thusly.

@@ -37,10 +29,6 @@ module.exports = function (a, b) {
bLen--;
}

if (aLen === 0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is taken care of in the prefix loop

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, the check at line 56 kicks in.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once again this is an early exit and result should be benchmarked.

return bLen;
}

if (bLen === 0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here aLen is always smaller or equal than bLen due to the swap, i.e. always false (if not or equal to aLen)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what you mean is, that if bLen is 0, then aLen must be 0 as well and the check at line 56 will kick in again and returns 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, here I guess that since we swapped the strings, we can remove this condition since if both strings have 0 length, condition in line 7 will handle it.

@@ -20,14 +20,6 @@ module.exports = function (a, b) {
var aLen = a.length;
var bLen = b.length;

if (aLen === 0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is taken care of in the suffix loop

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If aLen is equal to 0, the check at line 56 kicks in ending up in the same result. So yes, the result is the same. However, this is an early exit and thus you can argue that the faster we can exit and the less code we should execute, the better. But yes, this one can safely be removed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right indeed. But I guess we should benchmark the change to see if this yields any positive result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed it could be faster in some corner-cases

@sindresorhus
Copy link
Owner

sindresorhus commented May 16, 2017

See #8 and #7.

// @Yomguithereal

@gustf
Copy link
Contributor Author

gustf commented May 16, 2017

In general, I think removing or keeping the checks will not make a big difference performance-wise. Maybe keep some of them for clarity?

@Yomguithereal
Copy link
Collaborator

I guess the line 27 check can be removed. One solution would be to comment the checks clearly as early exits.

@gustf
Copy link
Contributor Author

gustf commented May 16, 2017

I will come back to this after work today

@Yomguithereal
Copy link
Collaborator

Can we speak about this also? You seem to have an edge there :). Haven't had much time to read your code but it seems you can speed up computations by performing several computations in one iteration of the loop? Is that it?

@gustf
Copy link
Contributor Author

gustf commented May 16, 2017

Can we speak about this also? You seem to have an edge there :). Haven't had much time to read your code but it seems you can speed up computations by performing several computations in one iteration of the loop? Is that it?

Yes thats it, I do loop unrolling on the outer loop and calculating 4 rows at a time in the inner loop :)

I did some reviewing regarding the checks I came to the following:

  1. Check on line 23: Bail when a is an empty string
  2. Check on line 27: Unnecessary check on bLen
  3. Check on line 40: Bail when a as a whole is a suffix to b

In case 1 and 3 the checks makes it slightly faster, but makes all "fall through" cases slightly slower.
2 can be safely removed regardless.

What do you think?

@gustf
Copy link
Contributor Author

gustf commented May 19, 2017

Sorry but I might misunderstand what was decided, did you want me to change the PR to keep the checks?

@sindresorhus
Copy link
Owner

@Yomguithereal What do you think?

@Yomguithereal
Copy link
Collaborator

I don't remember clearly where we were but I guess the idea would be, concerning the remaining and maybe redundant checks, to only keep them if the benchmark differences are negligible and if it enhances readability of the algorithm so one does not have to understand that we don't make such check here because it can be found aggregated with another one somewhere else.

@Yomguithereal
Copy link
Collaborator

In case 1 and 3 the checks makes it slightly faster, but makes all "fall through" cases slightly slower.
2 can be safely removed regardless.

I agree that we can drop n°2.

For the other ones, I guess the "fall through" scenario is the most common one to be expected in real use cases. So @gustf would you advocate to remove them (by how much is that faster in your benchmarks)?

@Yomguithereal
Copy link
Collaborator

Yomguithereal commented May 31, 2017

Afterwards, with your accord @gustf, I guess we could work on integrating and understanding your loop unrolling into this library?

On a side note, the Levensthein distance is often used to see if two strings are similar, which can be said if their distance is less than some threshold. In those cases, you will gain tremendous performance by using a capped Levenshtein distance function such as this one.

@gustf
Copy link
Contributor Author

gustf commented Jun 12, 2017

@Yomguithereal Sorry for late reply.

I haven't done any benchmarking with or without the checks but I think it will not really make a big difference. But since there will be less comparisons in the general case I think removing them makes sense.

And sure, we can do some work integrating the loop unrolling. I will have some more time in a few weeks. Very busy at work now before summer holidays.

@Yomguithereal
Copy link
Collaborator

Well I guess this can be merged then, @sindresorhus?

@sindresorhus sindresorhus changed the title removed unnecessary length checks + improved indexing of charCodeCache Remove unnecessary length checks + improved indexing of charCodeCache Jun 14, 2017
@sindresorhus sindresorhus merged commit 37573fe into sindresorhus:master Jun 14, 2017
ntwb pushed a commit to stylelint/stylelint that referenced this pull request Mar 13, 2019
## The dependency [leven](https://github.com/sindresorhus/leven) was updated from `2.1.0` to `3.0.0`.
This version is **not covered** by your **current version range**.

If you don’t accept this pull request, your project will work just like it did before. However, you might be missing out on a bunch of new features, fixes and/or performance improvements from the dependency update.

---

<details>
<summary>Release Notes for v3.0.0</summary>

<p>Breaking:</p>
<ul>
<li>Require Node.js 6 (<a class="issue-link js-issue-link" data-error-text="Failed to load issue title" data-id="419602561" data-permission-text="Issue title is private" data-url="sindresorhus/leven#12" data-hovercard-type="pull_request" data-hovercard-url="/sindresorhus/leven/pull/12/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/pull/12">#12</a>)  <a class="commit-link" data-hovercard-type="commit" data-hovercard-url="https://github.com/sindresorhus/leven/commit/3eb8c04c42c763be76f713965c2847443f365aa8/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/commit/3eb8c04c42c763be76f713965c2847443f365aa8"><tt>3eb8c04</tt></a></li>
</ul>
<p>Enhancements:</p>
<ul>
<li>Improve performance (<a class="issue-link js-issue-link" data-error-text="Failed to load issue title" data-id="228838401" data-permission-text="Issue title is private" data-url="sindresorhus/leven#10" data-hovercard-type="pull_request" data-hovercard-url="/sindresorhus/leven/pull/10/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/pull/10">#10</a>)  <a class="commit-link" data-hovercard-type="commit" data-hovercard-url="https://github.com/sindresorhus/leven/commit/37573fe34466202f0f2cdbf04cefe5d1968eed88/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/commit/37573fe34466202f0f2cdbf04cefe5d1968eed88"><tt>37573fe</tt></a></li>
<li>Add TypeScript definition (<a class="issue-link js-issue-link" data-error-text="Failed to load issue title" data-id="419602561" data-permission-text="Issue title is private" data-url="sindresorhus/leven#12" data-hovercard-type="pull_request" data-hovercard-url="/sindresorhus/leven/pull/12/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/pull/12">#12</a>)  <a class="commit-link" data-hovercard-type="commit" data-hovercard-url="https://github.com/sindresorhus/leven/commit/3eb8c04c42c763be76f713965c2847443f365aa8/hovercard" href="https://urls.greenkeeper.io/sindresorhus/leven/commit/3eb8c04c42c763be76f713965c2847443f365aa8"><tt>3eb8c04</tt></a></li>
</ul>
<p><a class="commit-link" href="https://urls.greenkeeper.io/sindresorhus/leven/compare/v2.1.0...v3.0.0"><tt>v2.1.0...v3.0.0</tt></a></p>
</details>

<details>
<summary>Commits</summary>
<p>The new version differs by 4 commits.</p>
<ul>
<li><a href="https://urls.greenkeeper.io/sindresorhus/leven/commit/254cdab92b2a3154dfb219ec0fcaa9645096e0f1"><code>254cdab</code></a> <code>3.0.0</code></li>
<li><a href="https://urls.greenkeeper.io/sindresorhus/leven/commit/06fda13f82dd3b86581f6815fa6e6a0b8cbed733"><code>06fda13</code></a> <code>Add a more realistic benchmark fixture</code></li>
<li><a href="https://urls.greenkeeper.io/sindresorhus/leven/commit/3eb8c04c42c763be76f713965c2847443f365aa8"><code>3eb8c04</code></a> <code>Require Node.js 6, add TypeScript definition (#12)</code></li>
<li><a href="https://urls.greenkeeper.io/sindresorhus/leven/commit/37573fe34466202f0f2cdbf04cefe5d1968eed88"><code>37573fe</code></a> <code>Remove unnecessary length checks + improved indexing of charCodeCache (#10)</code></li>
</ul>
<p>See the <a href="https://urls.greenkeeper.io/sindresorhus/leven/compare/0630566a69b5a73aae2e52bb47ea863892a4b5f0...254cdab92b2a3154dfb219ec0fcaa9645096e0f1">full diff</a></p>
</details>

<details>
  <summary>FAQ and help</summary>

  There is a collection of [frequently asked questions](https://greenkeeper.io/faq.html). If those don’t help, you can always [ask the humans behind Greenkeeper](https://github.com/greenkeeperio/greenkeeper/issues/new).
</details>

---


Your [Greenkeeper](https://greenkeeper.io) bot 🌴
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants