feature: Buffer.lastIndexOf #4846

Closed

Contributor

@dcposch dcposch commented Jan 24, 2016

Fixes #4604

work done

  • Added support for Buffer.lastIndexOf to match Buffer.indexOf
  • Can search for a string, another Buffer, or a specific byte value, consistent with Buffer.indexOf
  • For specific byte values, behavior is consistent with Uint8Array.lastIndexOf, which is now shadowed by Buffer.lastIndexOf, so existing code should continue to work
  • Added test cases
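For illustration, here is a minimal sketch of the three search modes listed above. It assumes a Node.js build that includes this change; `Buffer.from` and the sample string are not part of this PR and are used only to make the example self-contained.

```javascript
// Sample data is hypothetical; Buffer.from is assumed to be available.
const sample = Buffer.from('this buffer is a buffer');

// Search for a string.
const byString = sample.lastIndexOf('buffer');              // 17
// Search for another Buffer.
const byBuffer = sample.lastIndexOf(Buffer.from('buffer')); // 17
// Search for a single byte value (97 is the byte for 'a').
const byByte = sample.lastIndexOf(97);                      // 15
// indexOf still reports the first occurrence.
const first = sample.indexOf('buffer');                     // 5

console.log(byString, byBuffer, byByte, first);
```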

work left to do

  • Optimization. The implementation of reverse search in string_search.cc is naive and just uses a double for loop. Ideally we'd adapt BoyerMooreSearch to support reverse search, so that lastIndexOf is as fast as indexOf
Member

@Fishrock123 Fishrock123 commented Jan 24, 2016

cc @trevnorris :)

Contributor

@mikeal mikeal commented Jan 25, 2016

Oh wow, I've been waiting years for this :)

This should make writing parsers a lot nicer :)

Contributor Author

@dcposch dcposch commented Jan 27, 2016

@mikeal cool. I think I'm more or less done!
Let me know if it looks reasonable.

performance

I've refactored so that both lastIndexOf and indexOf use the same fast algorithms: Boyer-Moore / Boyer-Moore-Horspool.

To do this with a minimum of messiness and without any code duplication, I expanded the Vector<Char> class that was already in string_search.h so that it can provide a reversed view onto the underlying buffer. Then, the existing algorithms (Linear, Boyer-Moore, Boyer-Moore-Horspool) can be applied unmodified.

This works because lastIndexOf(haystack, needle) can be calculated from indexOf(reverse(haystack), reverse(needle)). Reversing the inputs is done via a lightweight view onto the original input buffer: it's not actually copying the buffers or doing anything slow like that.
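The reversed-view idea can be sketched in a few lines of JavaScript. This is only an analogy to the C++ Vector<Char> change, not the code in this PR: `makeView`, `linearSearch` and `lastIndexOfViaReverse` are hypothetical names, and a naive linear search stands in for the real algorithms.

```javascript
// Hypothetical helper: a read-only view over a byte array that can present
// the bytes in reverse order without copying them.
function makeView(bytes, forward) {
  return {
    length: bytes.length,
    at: (i) => (forward ? bytes[i] : bytes[bytes.length - 1 - i]),
  };
}

// A plain forward search that only talks to the view interface, so it does
// not know or care whether the view is reversed.
function linearSearch(haystack, needle) {
  for (let i = 0; i + needle.length <= haystack.length; i++) {
    let j = 0;
    while (j < needle.length && haystack.at(i + j) === needle.at(j)) j++;
    if (j === needle.length) return i;
  }
  return -1;
}

// lastIndexOf computed as indexOf over reversed views of both inputs.
function lastIndexOfViaReverse(haystackBytes, needleBytes) {
  const pos = linearSearch(makeView(haystackBytes, false),
                           makeView(needleBytes, false));
  // Map the position in the reversed haystack back to a forward index.
  return pos === -1 ? -1 : haystackBytes.length - needleBytes.length - pos;
}

console.log(lastIndexOfViaReverse(Buffer.from('abcabc'), Buffer.from('bc'))); // 4
```

The only extra work compared with a forward search is the index mapping on the last line: a match at position p in the reversed haystack corresponds to forward index n - m - p, for haystack length n and needle length m.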

testing

I added some additional test cases, to make sure that all of the search algorithms were being exercised.

(Background: Under the hood, the existing indexOf starts with Linear search, then if that's too slow, switches to Boyer-Moore-Horspool, which requires only a quick precomputation but has O(nm) worst-case complexity for finding a needle of size m in a haystack of size n. Finally, if that turns out to be too slow, it switches to Boyer-Moore, which requires more precomputation but has linear worst-case complexity.)
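For readers unfamiliar with the middle strategy, here is a minimal sketch of Boyer-Moore-Horspool over byte arrays. It is a textbook version written for illustration, not the implementation in string_search.h, and it leaves out the adaptive switching between strategies described above; `horspoolIndexOf` is a hypothetical name.

```javascript
// Textbook Boyer-Moore-Horspool over byte arrays (values 0..255).
// Precomputation is a single 256-entry shift table; worst case is O(nm).
function horspoolIndexOf(haystack, needle) {
  const n = haystack.length;
  const m = needle.length;
  if (m === 0) return 0;

  // For each byte value, how far the window may move when that byte is
  // aligned with the last position of the needle.
  const shift = new Array(256).fill(m);
  for (let i = 0; i < m - 1; i++) shift[needle[i]] = m - 1 - i;

  let pos = 0;
  while (pos + m <= n) {
    // Compare the window against the needle from right to left.
    let j = m - 1;
    while (j >= 0 && haystack[pos + j] === needle[j]) j--;
    if (j < 0) return pos;
    // Skip ahead based on the byte under the needle's last position.
    pos += shift[haystack[pos + m - 1]];
  }
  return -1;
}

console.log(horspoolIndexOf(Buffer.from('this buffer is a buffer'),
                            Buffer.from('buffer'))); // 5
```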

Surprisingly, the existing test cases never seem to exercise the Boyer-Moore fallback at all!

I added a test case that does go there.

If you want to reproduce the output below, compile with DEBUG_STRING_SEARCH defined and add a failing assertion to the bottom of test-buffer-indexof.js (otherwise the test runner swallows the output).

=== release test-buffer-indexof ===                                            
Path: parallel/test-buffer-indexof
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search BOYER-MOORE-HORSPOOL
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search LINEAR
forward search LINEAR
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search BOYER-MOORE-HORSPOOL
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR
forward search LINEAR

(Above: existing test cases for indexOf. Note it never runs Boyer-Moore.)

(Below: new test cases, added in this PR, for lastIndexOf. Uses the same code to do all the heavy lifting, no code duplication. Tests Boyer-Moore.)

reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search LINEAR
reverse search BOYER-MOORE-HORSPOOL
reverse search BOYER-MOORE-HORSPOOL
reverse search BOYER-MOORE-HORSPOOL
reverse search BOYER-MOORE
reverse search BOYER-MOORE-HORSPOOL
reverse search BOYER-MOORE
reverse search BOYER-MOORE-HORSPOOL
reverse search BOYER-MOORE
@dcposch dcposch force-pushed the dcposch:master branch Jan 27, 2016
Contributor Author

@dcposch dcposch commented Jan 27, 2016

Also, this PR uses memrchr to search for a single byte value in a Buffer, back to front. Like memchr, it's a lot faster than just looping over bytes.

Unfortunately, memrchr is a GNU extension; unlike memchr, it is not part of POSIX.

  1. Does Node have to compile in places where memrchr isn't available?
  2. If so, should I find a polyfill, for example this one?
    https://github.com/c9/node-gnu-tools/blob/master/grep-src/lib/memrchr.c
  3. Alternatively, we can use the magic bits method described here:
    http://cebka.blogspot.com/2015/04/how-fast-is-your-memchr.html
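For reference, the behavior any memrchr fallback has to provide is a plain backward byte scan. The sketch below is written in JavaScript purely to illustrate the semantics; it is not the polyfill linked above, `memrchrLike` is a hypothetical name, and it returns an index where the C function returns a pointer.

```javascript
// Hypothetical analog of memrchr(s, c, n): scan the first n bytes of `bytes`
// from back to front and return the index of the last byte equal to c,
// or -1 if there is none. Like the C function, c is compared as a byte.
function memrchrLike(bytes, c, n) {
  const target = c & 0xff;
  for (let i = Math.min(n, bytes.length) - 1; i >= 0; i--) {
    if (bytes[i] === target) return i;
  }
  return -1;
}

const data = Buffer.from('abcabc');
console.log(memrchrLike(data, 0x62, data.length)); // 4 (last 'b')
console.log(memrchrLike(data, 0x62, 4));           // 1 (only bytes 0..3 are scanned)
```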
@jasnell jasnell added the wip label Jan 27, 2016
Contributor

@feross feross commented Jan 28, 2016

Thanks for tackling this PR, @dcposch!

@dcposch dcposch force-pushed the dcposch:master branch Jan 28, 2016
trevnorris reviewed Jan 28, 2016
src/node_buffer.cc Outdated
@@ -847,31 +882,25 @@ void IndexOfString(const FunctionCallbackInfo<Value>& args) {
SPREAD_ARG(args[0], ts_obj);

Local<String> needle = args[1].As<String>();
int64_t offset_i64 = args[2]->IntegerValue();
bool is_forward = args[4]->BooleanValue();

trevnorris Jan 28, 2016
Contributor

Faster to do IsTrue()

trevnorris reviewed Jan 29, 2016
src/node_buffer.cc Outdated
@@ -949,6 +983,8 @@ void IndexOfBuffer(const FunctionCallbackInfo<Value>& args) {
THROW_AND_RETURN_UNLESS_BUFFER(Environment::GetCurrent(args), args[0]);
SPREAD_ARG(args[0], ts_obj);
SPREAD_ARG(args[1], buf);
int64_t offset_i64 = args[2]->IntegerValue();
bool is_forward = args[4]->BooleanValue();

if (buf_length > 0)
CHECK_NE(buf_data, nullptr);

trevnorris Jan 29, 2016
Contributor

This is an artifact of old code. SPREAD_ARG already does this check. Feel free to remove these checks in applicable methods.

trevnorris reviewed Jan 29, 2016
lib/buffer.js Outdated
throw new TypeError('"val" argument must be string, number or Buffer');
}

function slowIndexOf(buffer, val, byteOffset, encoding, dir) {

trevnorris Jan 29, 2016
Contributor

style nit: functions in this file are two lines apart.

@@ -949,6 +983,8 @@ void IndexOfBuffer(const FunctionCallbackInfo<Value>& args) {
THROW_AND_RETURN_UNLESS_BUFFER(Environment::GetCurrent(args), args[0]);

trevnorris Jan 29, 2016
Contributor

Need to place THROW_AND_RETURN_UNLESS_BUFFER(Environment::GetCurrent(args), args[1]); here. The value passed to lastIndexOf() could have had its prototype messed with, resulting in the following:

$ ./node -e 'Buffer(4).lastIndexOf({__proto__: Buffer.prototype})'
node: ../src/node_buffer.cc:985: void node::Buffer::IndexOfBuffer(const FunctionCallbackInfo<v8::Value> &): Assertion `(args[1])->IsUint8Array()' failed.
Aborted (core dumped)

And even though they're messing around in a way they shouldn't, it still shouldn't be possible to cause an abort using the JS API.
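To make the failure mode concrete, here is a small sketch of why a prototype-based check is not enough. It assumes a later Node.js version where `util.types.isUint8Array` exists; that function is not part of this PR and is used only to show the difference.

```javascript
const util = require('util');

// An ordinary object whose prototype has been pointed at Buffer.prototype.
const fake = { __proto__: Buffer.prototype };

// A check based on the prototype chain is fooled...
const looksLikeBuffer = fake instanceof Buffer;                    // true
// ...but there is no typed array behind the object.
const fakeIsUint8Array = util.types.isUint8Array(fake);            // false
const realIsUint8Array = util.types.isUint8Array(Buffer.alloc(4)); // true

console.log(looksLikeBuffer, fakeIsUint8Array, realIsUint8Array);
```

Native code that reads the argument's backing store therefore has to verify the value itself rather than trust that it arrived through a Buffer method.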

dcposch Jan 29, 2016
Author Contributor

sure, i can add that, but how did this work before? IndexOfBuffer was used by indexOf before this PR and didn't do the check

dcposch Jan 30, 2016
Author Contributor

@trevnorris added

trevnorris Feb 1, 2016
Contributor

Yup. It crashed before. Thanks for bringing it to my attention. :)

Contributor

@trevnorris trevnorris commented Jan 29, 2016

Started to review, but have to step away. Will finish this tomorrow.

@dcposch Nice job on the inline comments. Makes the patch easier to follow. Especially for one this size.

Contributor Author

@dcposch dcposch commented Jan 29, 2016

@trevnorris thx. Fixed all the things you pointed out so far

trevnorris reviewed Jan 29, 2016
lib/buffer.js Outdated
} else if (byteOffset < -0x80000000) {
byteOffset = -0x80000000;
}
if (typeof byteOffset !== 'number' || isNaN(byteOffset)) {

trevnorris Jan 29, 2016
Contributor

This shouldn't be necessary. To follow string.lastIndexOf() the offset should be coerced to a primitive. So for example 'abcde'.lastIndexOf('c', [1]) === -1. Basically the byteOffset >>= 0 below should be enough.

dcposch Jan 30, 2016
Author Contributor

I think we need this to match the behavior of Buffer.indexOf

  • buf.indexOf('foo') searches the whole buffer, as does buf.indexOf('foo', null), buf.indexOf('foo', 'foo'), etc
  • buf.indexOf('foo', 0) searches starting from index 0, which also searches the whole buffer
  • buf.lastIndexOf('foo') should def search the whole buffer, but
  • buf.lastIndexOf('foo', 0) does a reverse search starting from index 0, so it only checks for a match at index 0

So at a minimum, we have to special-case undefined

I think it's best if buf.lastIndexOf('foo', null), buf.lastIndexOf('foo', NaN) etc match buf.lastIndexOf('foo') -- in other words, they should search the whole buffer. That means they're NOT equivalent to buf.lastIndexOf('foo', 0)

trevnorris Feb 1, 2016
Contributor

I agree. The operation is as simple as offset = +offset. This will coerce all isNaN() values to NaN, which can then be checked by Number.isNaN(). It will also coerce values like [2] to 2, which is also how String#lastIndexOf() operates.

So any value that returns true for Number.isNaN() after the coercion is set to the default value. Though note this does exclude null. Which is the same way strings work. e.g. 'abc'.lastIndexOf('b', null) === -1.
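The coercion rule described here can be sketched as follows. `coerceLastIndexOfOffset` is a hypothetical helper written for illustration, not the code in lib/buffer.js, and the Buffer results assume a Node.js version that includes this change.

```javascript
// Hypothetical helper: coerce with unary plus, then fall back to the default
// only when the result is NaN. null and [] coerce to 0, so they do NOT get
// the default.
function coerceLastIndexOfOffset(byteOffset, defaultOffset) {
  const n = +byteOffset;
  return Number.isNaN(n) ? defaultOffset : n;
}

const text = 'abcdef';
const bytes = Buffer.from(text);

// undefined and {} coerce to NaN, so the whole input is searched.
const strUndefined = text.lastIndexOf('b', undefined); // 1
const bufUndefined = bytes.lastIndexOf('b', undefined); // 1
const bufObject = bytes.lastIndexOf('b', {});           // 1

// null and [] coerce to 0, so only index 0 is checked.
const strNull = text.lastIndexOf('b', null);  // -1
const bufNull = bytes.lastIndexOf('b', null); // -1
const bufArray = bytes.lastIndexOf('b', []);  // -1

console.log(strUndefined, bufUndefined, bufObject, strNull, bufNull, bufArray);
```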

dcposch Feb 2, 2016
Author Contributor

fixed. i added a test case to ensure 'abc'.lastIndexOf('b', null) === -1

}

if (haystack_length < offset || needle_length + offset > haystack_length) {
size_t offset = static_cast<size_t>(opt_offset);

trevnorris Jan 29, 2016
Contributor

I'm a paranoid person. Mind adding a CHECK_LT(offset, haystack_length); just after this cast? Then also changing this conditional above from opt_offset == -1 to opt_offset <= -1. That should help handle accidental under/overflow cases.

There's one other place where this is applicable. Will make a note when I get to it.

dcposch Jan 30, 2016
Author Contributor

Sounds good, done

dcposch Jan 30, 2016
Author Contributor

@trevnorris added CHECK_LT in the three places where I factored out IndexOfOffset

Contributor Author

@dcposch dcposch commented Feb 1, 2016

@trevnorris notice me senpai

Member

@Fishrock123 Fishrock123 commented Feb 1, 2016

@dcposch some of us who work on this more do take the weekends off. ;)

trevnorris reviewed Feb 1, 2016
src/string_search.h Outdated
#define DEBUG_TRACE(s) printf("%s search %s\n", \
subject.forward() ? "forward" : "reverse", s);
#else
#define DEBUG_TRACE(s) // no-op

trevnorris Feb 1, 2016
Contributor

@bnoordhuis have any comments on this?

bnoordhuis Feb 1, 2016
Member

I'd leave this out.

dcposch Feb 2, 2016
Author Contributor

Removed. Note though that this was the only way I noticed some serious missing test coverage: the whole Boyer-Moore algorithm (nontrivial code, rarely exercised) was never reached by the unit tests before. (The unit tests previously only covered Boyer-Moore-Horspool, Linear, and Single-Char. After this PR they cover Boyer-Moore as well.)

Contributor

@trevnorris trevnorris commented Feb 1, 2016

@dcposch Left a few more comments. Now that the weekend is over I'll be more attentive. :)

Contributor Author

@dcposch dcposch commented Feb 2, 2016

@trevnorris fixed. Thanks for checking it out!

Contributor

@trevnorris trevnorris commented Feb 2, 2016

Contributor Author

@dcposch dcposch commented Feb 2, 2016

@trevnorris I clicked Authorize, but it says

Access Denied
dcposch is missing the Overall/Read permission
Member

@rvagg rvagg commented Feb 3, 2016

Sorry @dcposch, we have CI in lockdown until we get our security releases out, you'll have to rely on collaborators to get you info on how the jobs have gone until it's opened back up again next week. / #4857

There are compile errors on OSX:

In file included from ../src/node_buffer.cc:7:
../src/string_search.h:287:11: error: use of undeclared identifier 'memrchr'; did you mean 'memchr'?
    pos = memrchr(subject.start(), pattern_first_char, subj_len - index);
          ^~~~~~~
          memchr
/usr/include/string.h:70:7: note: 'memchr' declared here
void *memchr(const void *, int, size_t);
      ^
../src/node_buffer.cc:1061:11: error: use of undeclared identifier 'memrchr'; did you mean 'memchr'?
    ptr = memrchr(ts_obj_data, needle, offset + 1);
          ^~~~~~~
          memchr
/usr/include/string.h:70:7: note: 'memchr' declared here
void *memchr(const void *, int, size_t);
      ^
In file included from ../src/node_buffer.cc:7:
../src/string_search.h:252:18: error: use of undeclared identifier 'memrchr'
      void_pos = memrchr(subject.start(), search_byte, bytes_to_search);
                 ^
../src/string_search.h:308:10: note: in instantiation of function template specialization 'node::stringsearch::FindFirstCharacter<unsigned short>' requested here
  return FindFirstCharacter(search->pattern_, subject, index);
         ^
../src/string_search.h:107:22: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::SingleCharSearch' requested here
        strategy_ = &SingleCharSearch;
                     ^
../src/string_search.h:607:22: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::StringSearch' requested here
  StringSearch<Char> search(pattern);
                     ^
../src/string_search.h:636:36: note: in instantiation of function template specialization 'node::stringsearch::SearchString<unsigned short>' requested here
  size_t pos = node::stringsearch::SearchString(
                                   ^
../src/node_buffer.cc:928:16: note: in instantiation of function template specialization 'node::SearchString<unsigned short>' requested here
      result = SearchString(reinterpret_cast<const uint16_t*>(haystack),
               ^
In file included from ../src/node_buffer.cc:7:
../src/string_search.h:326:9: error: no matching function for call to 'FindFirstCharacter'
    i = FindFirstCharacter(pattern, subject, i);
        ^~~~~~~~~~~~~~~~~~
../src/string_search.h:110:20: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::LinearSearch' requested here
      strategy_ = &LinearSearch;
                   ^
../src/string_search.h:607:22: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::StringSearch' requested here
  StringSearch<Char> search(pattern);
                     ^
../src/string_search.h:636:36: note: in instantiation of function template specialization 'node::stringsearch::SearchString<unsigned short>' requested here
  size_t pos = node::stringsearch::SearchString(
                                   ^
../src/node_buffer.cc:928:16: note: in instantiation of function template specialization 'node::SearchString<unsigned short>' requested here
      result = SearchString(reinterpret_cast<const uint16_t*>(haystack),
               ^
../src/string_search.h:235:15: note: candidate template ignored: substitution failure [with Char = unsigned short]
inline size_t FindFirstCharacter(Vector<const Char> pattern,
              ^
../src/string_search.h:575:11: error: no matching function for call to 'FindFirstCharacter'
      i = FindFirstCharacter(pattern, subject, i);
          ^~~~~~~~~~~~~~~~~~
../src/string_search.h:113:18: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::InitialSearch' requested here
    strategy_ = &InitialSearch;
                 ^
../src/string_search.h:607:22: note: in instantiation of member function 'node::stringsearch::StringSearch<unsigned short>::StringSearch' requested here
  StringSearch<Char> search(pattern);
                     ^
../src/string_search.h:636:36: note: in instantiation of function template specialization 'node::stringsearch::SearchString<unsigned short>' requested here
  size_t pos = node::stringsearch::SearchString(
                                   ^
../src/node_buffer.cc:928:16: note: in instantiation of function template specialization 'node::SearchString<unsigned short>' requested here
      result = SearchString(reinterpret_cast<const uint16_t*>(haystack),
               ^
../src/string_search.h:235:15: note: candidate template ignored: substitution failure [with Char = unsigned short]
inline size_t FindFirstCharacter(Vector<const Char> pattern,
              ^
5 errors generated.
make[2]: *** [/Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/obj.target/node/src/node_buffer.o] Error 1
make[2]: *** Waiting for unfinished jobs....
  c++ '-D_DARWIN_USE_64_BIT_INODE=1' '-DNODE_ARCH="x64"' '-DNODE_WANT_INTERNALS=1' '-DV8_DEPRECATION_WARNINGS=1' '-DHAVE_OPENSSL=1' '-DHAVE_DTRACE=1' '-D__POSIX__' '-DNODE_PLATFORM="darwin"' '-DHTTP_PARSER_STRICT=0' '-D_LARGEFILE_SOURCE' '-D_FILE_OFFSET_BITS=64' -I../src -I../tools/msvs/genfiles -I../deps/uv/src/ares -I/Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/obj/gen -I../deps/v8 -I../deps/cares/include -I../deps/v8/include -I../deps/openssl/openssl/include -I../deps/zlib -I../deps/http_parser -I../deps/uv/include  -Os -gdwarf-2 -mmacosx-version-min=10.5 -arch x86_64 -Wall -Wendif-labels -W -Wno-unused-parameter -std=gnu++0x -fno-rtti -fno-exceptions -fno-threadsafe-statics -fno-strict-aliasing -MMD -MF /Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/.deps//Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/obj.target/node/src/tls_wrap.o.d.raw   -c -o /Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/obj.target/node/src/tls_wrap.o ../src/tls_wrap.cc
In file included from ../src/string_search.cc:1:
../src/string_search.h:287:11: error: use of undeclared identifier 'memrchr'; did you mean 'memchr'?
    pos = memrchr(subject.start(), pattern_first_char, subj_len - index);
          ^~~~~~~
          memchr
/usr/include/string.h:70:7: note: 'memchr' declared here
void *memchr(const void *, int, size_t);
      ^
1 error generated.

Windows

c:\workspace\node-compile-windows\label\win-vs2013\src\string_search.h(287): error C3861: 'memrchr': identifier not found (src\node_buffer.cc) [c:\workspace\node-compile-windows\label\win-vs2013\node.vcxproj]

SmartOS

In file included from ../src/node_buffer.cc:7:0:
../src/string_search.h: In function 'std::size_t node::stringsearch::FindFirstCharacter(node::stringsearch::Vector<const Char>, node::stringsearch::Vector<const Char>, std::size_t) [with Char = unsigned char; std::size_t = long unsigned int]':
../src/string_search.h:287:72: error: 'memrchr' was not declared in this scope
     pos = memrchr(subject.start(), pattern_first_char, subj_len - index);
                                                                        ^
../src/node_buffer.cc: In function 'void node::Buffer::IndexOfNumber(const v8::FunctionCallbackInfo<v8::Value>&)':
../src/node_buffer.cc:1061:50: error: 'memrchr' was not declared in this scope
     ptr = memrchr(ts_obj_data, needle, offset + 1);
                                                  ^
In file included from ../src/node_buffer.cc:7:0:
../src/string_search.h: In instantiation of 'size_t node::stringsearch::FindFirstCharacter(node::stringsearch::Vector<const Char>, node::stringsearch::Vector<const Char>, size_t) [with Char = short unsigned int; size_t = long unsigned int]':
../src/string_search.h:308:61:   required from 'static size_t node::stringsearch::StringSearch<Char>::SingleCharSearch(node::stringsearch::StringSearch<Char>*, node::stringsearch::Vector<const Char>, size_t) [with Char = short unsigned int; size_t = long unsigned int]'
../src/string_search.h:107:21:   required from 'node::stringsearch::StringSearch<Char>::StringSearch(node::stringsearch::Vector<const Char>) [with Char = short unsigned int]'
../src/string_search.h:607:36:   required from 'size_t node::stringsearch::SearchString(node::stringsearch::Vector<const Char>, node::stringsearch::Vector<const Char>, size_t) [with Char = short unsigned int; size_t = long unsigned int]'
../src/string_search.h:637:49:   required from 'size_t node::SearchString(const Char*, size_t, const Char*, size_t, size_t, bool) [with Char = short unsigned int; size_t = long unsigned int]'
../src/node_buffer.cc:933:39:   required from here
../src/string_search.h:252:71: error: 'memrchr' was not declared in this scope
       void_pos = memrchr(subject.start(), search_byte, bytes_to_search);
                                                                       ^
node.target.mk:157: recipe for target '/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/obj.target/node/src/node_buffer.o' failed
make[2]: *** [/home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos14-64/out/Release/obj.target/node/src/node_buffer.o] Error 1

The test also failed to run on armv8, but that was a Jenkins problem as far as I can tell.

Looks like there's still quite a bit of work to do on cross-platform compat here.

Contributor Author

@dcposch dcposch commented Feb 3, 2016

@rvagg @trevnorris thanks!

Yeah this goes back to my first question at the top of the PR, about whether node builds can use memrchr. Looks like they can on some systems but not on others.

I added a fallback for those systems. LMK if the CI is happier now!

@dcposch dcposch force-pushed the dcposch:master branch Feb 3, 2016
Member

@jasnell jasnell commented Apr 8, 2016

Whatever makes it in before I start working on it on Monday ;)
On Apr 8, 2016 4:29 PM, "Trevor Norris" wrote:

@jasnell If this lands before Monday, could it make it in, or have all the RC commits been chosen?

Contributor

@trevnorris trevnorris commented Apr 15, 2016

Seems there's a minor performance hit with this PR, but I believe it's within an acceptable level. Anyone have any objections? If not then let's land it.

Member

@Fishrock123 Fishrock123 commented Apr 15, 2016

🚢

Member

@jasnell jasnell commented Apr 15, 2016

Works for me.

Member

@jasnell jasnell commented Apr 22, 2016

@trevnorris ... there's still time to get this in. Is it ready to go?
@dcposch ... can you rebase?

Contributor Author

@dcposch dcposch commented Apr 22, 2016

@dcposch ... can you rebase?

@jasnell yes, will be done v soon

dcposch added 2 commits Jan 28, 2016
* Remove unnecessary templating from SearchString

  SearchString used to have separate PatternChar and SubjectChar template type
  arguments, apparently to support things like searching for an 8-bit string
  inside a 16-bit string or vice versa. However, SearchString is only used from
  node_buffer.cc, where PatternChar and SubjectChar are always the same. Since
  this is extra complexity that's unused and untested (simplifying to a single
  Char template argument still compiles and didn't break any unit tests), I
  removed it.

* Use Boyer-Moore[-Horspool] for both indexOf and lastIndexOf

  Add test cases for lastIndexOf. Test the fallback from BMH to
  Boyer-Moore, which looks like it was totally untested before.

* Extra bounds checks in node_buffer.cc

* Extra asserts in string_search.h

* Buffer.lastIndexOf: clean up, enforce consistency w/ String.lastIndexOf

* Polyfill memrchr(3) for non-GNU systems
@dcposch dcposch force-pushed the dcposch:master branch to 698ed92 Apr 23, 2016
Contributor Author

@dcposch dcposch commented Apr 23, 2016

@jasnell fixed

Member

@jasnell jasnell commented Apr 23, 2016

@jasnell jasnell added this to the 6.0.0 milestone Apr 23, 2016
Contributor Author

@dcposch dcposch commented Apr 23, 2016

@jasnell looks like test-tls-inception failed on FreeBSD and tests passed on the other platforms. I don't know if it's related to this change--looks unlikely. Want to try re-running it?

Failing build: https://ci.nodejs.org/job/node-test-commit-freebsd/2167/
Failing test: https://ci.nodejs.org/job/node-test-commit-freebsd/2167/nodes=freebsd10-64/tapTestReport

Contributor Author

@dcposch dcposch commented Apr 23, 2016

Sweet, everything worked that time including FreeBSD

jasnell added a commit that referenced this pull request Apr 25, 2016

PR-URL: #4846
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Member

@jasnell jasnell commented Apr 25, 2016

Landed in 6c1e5ad

@jasnell jasnell closed this Apr 25, 2016
joelostrowski added a commit to joelostrowski/node that referenced this pull request Apr 25, 2016
Contributor

@trevnorris trevnorris commented Apr 25, 2016

@jasnell Was the squash/merge button used? I'm trying to figure out why the Author: field was changed from the listed commits.

Member

@jasnell jasnell commented Apr 25, 2016

No, I squashed like normal. Didn't notice that the author changed :-/

Contributor

@trevnorris trevnorris commented Apr 25, 2016

Strange. Oh well. Nothing serious.

jasnell added a commit that referenced this pull request Apr 26, 2016